To my parents


Preface 


This textbook has been developed for use in the two-semester intro- 
ductory course in mathematical analysis at the Massachusetts Institute of 
Technology. The aim of the course is to introduce the student to basic con- 
cepts, principles, and methods of mathematical analysis. 

The presumed mathematical background of the students is a solid 
calculus course covering one and (some of) several variables, plus (perhaps) 
elementary differential equations and linear algebra. The linear algebra 
background is not necessary until the second semester, since it enters the 
early chapters only through certain examples and exercises which utilize 
matrices. At M.I.T. the introductory calculus course is condensed into one 
year, after which the student has available a one-semester course in dif- 
ferential equations and linear algebra. Thus, over half the students in the 
course are sophomores. Since many students enter M.I.T. having had a 
serious calculus course in high school, there are quite a few freshmen in 
the course. The remainder of the students tend to be juniors, seniors, or 
graduate students in fields such as physics or electrical engineering. Since 
very little prior experience with rigorous mathematical thought is as- 
sumed, it has been our custom to augment the lectures by structured 
tutorial sessions designed to help the students in learning to deal with 
precise mathematical definitions and proofs. It is to be expected that at 
many institutions the text would be suitable for a junior, senior, or graduate 
course in analysis, since it does assume a considerable technical facility 
with elementary mathematics as well as an affinity for mathematical 
thought. 

The presentation differs from that found in existing texts in two ways. 
First, a concerted effort is made to keep the introductions to real and
complex analysis close together. These subjects have been separated in the
curriculum for a number of years, thus tending to delay the introduction
to complex function theory. Second, the generalizations beyond R^n are
presented for subsets of normed linear spaces rather than for metric spaces.
The pedagogical advantage of this is that the original material can be 
developed on the familiar terrain of Euclidean space and then simply 
observed to be largely valid for normed linear spaces, where the symbolism 
is just like that of R^n. The students are prepared for the generalization in
much the same way that high-school algebra prepares one for manipula- 
tion in a commutative ring with identity. 

The first semester covers the bulk of the first five chapters. It empha- 
sizes the Four C’s: completeness, convergence, compactness, and con- 
tinuity. The basic results are presented for subsets of and functions on 
Euclidean space of n dimensions. This presentation includes (of course) a 
rigorous review of the intellectual skeleton of calculus, placing greater 
emphasis on power series expansions than one normally can in a calculus 
course. The discussion proceeds (in Chapter 5) into complex power series 
and an introduction to the theory of complex-analytic functions. The 
review of linear geometry in Section 1.6 is usually omitted from the formal 
structure of the first semester. The instructor who is pressed for time or 
who is predisposed to separate real and complex analysis may also omit 
all or part of Sections 5.5-5.10 on analytic functions and Fourier series 
without interrupting the flow of the remainder of the text. 

The second semester begins with Chapter 6. It reviews the main results 
of the first semester, the review being carried out in the context of (subsets 
of and functions on) normed linear spaces. The author has found that the 
student is readily able to absorb the fact that many of the arguments he 
or she has been exposed to are formal and are therefore valid in the more 
general context. It is then emphasized that two of the most crucial results 
from the first semester (the completeness of R^n and the Heine-Borel
theorem) depend on finite-dimensionality. This leads naturally to a discussion
of (i) complete (Banach) spaces, the Baire category theorem and
fixed points of contractions, and (ii) compact subsets of various normed 
linear spaces, in particular, equicontinuity and Ascoli’s theorem. From 
there the course moves to the Lebesgue integral on R^n, which is developed
by completing the space of continuous functions of compact support. Most 
of the basic properties of integral and measure are discussed, and a short 
presentation of orthogonal expansions (especially Fourier series) is included.
The final chapter of the notes deals with differentiable maps on
R^n, the implicit and inverse function theorems, and the change of variable
theorem. This chapter may be presented earlier if the instructor finds it 
desirable, since the only dependence on Lebesgue integration is the proof 
of the change of variable theorem. 

A few final remarks. Some mathematicians will look at these notes
and say, “How can you teach an introductory course in analysis which
never mentions partial differential equations or calculus of variations?” 
Others will ask, “How can you teach a basic course in analysis which 
devotes so little attention to applications, either to mathematics or to other 
fields of science?” The answer is that there is no such thing as the intro- 
ductory course in analysis. The subject is too large and too important to 
allow for that. The three most viable foci for organization of an intro- 
ductory course seem to be (i) emphasis on general concepts and principles, 
(ii) emphasis on hard mathematical analysis (the source of the general 
ideas), and (iii) emphasis on applications to science and engineering. This 
text was developed for the first type of course. It can be very valuable for a 
certain category of students, principally the students going on to graduate 
school in mathematics, physics, or (abstract) electrical engineering, etc. 
It is not, and was not intended to be, right for all students who may need 
some advanced calculus or analysis beyond the elementary level. 

Thanks are due to many people who have contributed to the develop- 
ment of this text over the last eight years. Colleagues too numerous to 
mention used the classroom notes and pointed out errors or suggested 
improvements. Three must be singled out: Steven Minsker, David 
Ragozin, and Donald Wilken. Each of them assisted the author in improv- 
ing the notes and managing the pedagogical affairs of the M.I.T. course. I 
am especially grateful to David Ragozin, who wrote an intermediate ver- 
sion of the chapter on Lebesgue integration. I am indebted to Mrs. Sophia 
Koulouras, who typed the original notes, and to Miss Viola Wiley, who 
typed the revision and the final manuscript. Finally, my thanks to Art 
Wester and the staff of Prentice-Hall, Inc. 


KENNETH HOFFMAN


Preface
to the Student 


This textbook will introduce you to many of the general principles 
of mathematical analysis. It assumes that you have a mathematical back- 
ground which includes a solid course (at least one year) in the calculus of 
functions of one and several variables, as well as a short course in dif- 
ferential equations. It would be helpful if you have been exposed to intro- 
ductory linear algebra since many of the exercises and examples involve 
matrices. The material necessary for following these exercises and examples 
is summarized in Section 1.6, but a linear algebra background is not 
essential for reading the book since it does not enter into the logical devel- 
opment in the text until Chapter 6. 

You will meet a large number of concepts which are new to you, and 
you will be challenged to understand their precise definitions, some of 
their uses, and their general significance. In order to understand the mean- 
ing of this in quantitative terms, thumb through the Index and see how 
many of the terms listed there you can describe precisely. But it is the 
qualitative impact of the definitions which will loom largest in your ex- 
perience with this book. You may find that you are having difficulty fol- 
lowing the “proofs” presented in the book or even in understanding what 
a “proof” is. When this happens, look to the definitions because the chances 
are that your real difficulty lies in the fact that you have only a hazy 
understanding of the definitions of basic concepts or are suffering from a 
lack of familiarity with definitions which mean exactly what they say, 
nothing less and nothing more. 

You will also learn a lot of rich and beautiful mathematics. To make 
the learning task more manageable, the notes have been provided with 
supplementary material and mechanisms which you should utilize:

1. Appendix: Note that the text proper is followed by an Appendix
which discusses sets, functions, and a bit about cardinality (finite, infinite, 
countable, and uncountable sets). Read the first part on sets and functions 
and then refer to the remainder when it comes up in the notes. 

2. Bibliography: There is a short bibliography to which you might 
turn if you’re having trouble or want to go beyond the notes. 

3. List of Symbols: If a symbol occurs in the notes which you don’t 
recognize, try this list. 

4. Index: The Index is fairly extensive and can lead you to various 
places where a given concept or result is discussed. 


One last thing. Use the Exercises to test your understanding. Most 
of them come with specific instructions or questions, “Find this”, “Prove 
that”, “True or false?”. Occasionally an exercise will come without in- 
structions and will be a simple declarative sentence, “Every differentiable 
function is continuous”. Such statements are to be proved. Their occur- 
rence reflects nothing more than the author’s attempt to break the mo- 
notony of saying, “Prove that...” over and over again. The exercises 
marked with an asterisk are (usually) extremely difficult. Don’t be dis- 
couraged if some of the ones without asterisks stump you. A few of them 
were significant mathematical discoveries not so long ago. 


KENNETH HOFFMAN


1. Numbers and Geometry


1.1. The Real Number System 


The basic prerequisite for reading this book is a familiarity with the 
real number system. That familiarity should include a facility both with 
the elementary algebra of real numbers and with a few inequalities derived 
from the natural ordering of those numbers. This section is designed to 
emphasize some properties of the number system which may be less familiar.

The first thing we shall do is to list a few fundamental properties of 
algebra and order from which all of the properties of the real number sys- 
tem can be deduced. Let R be the set of real numbers. 


A. Field Axioms. On the set R there are two operations, as follows. 
The first operation, called addition, associates with each pair of elements 
x, y in R an element (x + y) in R. The second operation, called multiplication,
associates with each pair of elements x, y in R an element xy in R.
These two operations have the following properties. 


1. Addition is commutative,

x + y = y + x

for all x and y in R.

2. Addition is associative,

(x + y) + z = x + (y + z)

for all x, y, and z in R.

3. There is a unique element 0 (zero) in R such that x + 0 = x for
all x in R.

4. To each x in R there corresponds a unique element -x in R such
that x + (-x) = 0.

5. Multiplication is commutative,

xy = yx

for all x and y in R.

6. Multiplication is associative,

(xy)z = x(yz)

for all x, y, and z in R.

7. There is a unique element 1 (one) in R such that x1 = x for all x
in R.

8. To each non-zero x in R there corresponds a unique element x^{-1}
(or 1/x) in R such that x x^{-1} = 1.

9. 1 ≠ 0.

10. Multiplication distributes over addition,

x(y + z) = xy + xz

for all x, y, and z in R.


B. Order Axioms. There is on R a relation <, called less than, with
these properties.


1. If x and y are in R, one and only one of the following holds:

x < y;  x = y;  y < x.

2. x < y if and only if 0 < y - x.
3. If 0 < x and 0 < y, then 0 < (x + y) and 0 < xy.


C. Completeness Axiom. If S and T are non-empty subsets of R such 
that 


(i) R = S ∪ T;
(ii) s < t for every s in S and every t in T,


then either there exists a largest number in the set S or there exists a 
smallest number in the set T. 


These properties are usually summarized by saying that the set of real 
numbers, with its usual addition, multiplication, and ordering, is (A) a 
field, which (B) is ordered and which (C) is complete in that ordering. 
Briefly, the real number system is a complete ordered field. 

From the field axioms (A), we could deduce the various algebraic 
relations which we shall use; however, we shall not do that. We shall use 
without comment basic identities such as the binomial theorem

(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}

or the telescoping property of a geometric series

1 - x^{n+1} = (1 - x)(1 + x + x^2 + ··· + x^n).
We could define the set of positive integers

Z_+ = {1, 2, 3, ...}

(from the axioms) as the set consisting of the numbers 1, 1 + 1, 1 + 1 + 1, ...;
and then we could prove the principle of mathematical induction:
If S is a subset of R such that

(a) 1 ∈ S;
(b) if x ∈ S then (x + 1) ∈ S,

then S contains every positive integer. Then we could define the set of
integers

Z = {..., -2, -1, 0, 1, 2, ...}

and the set of rational numbers

Q = {m/n; m ∈ Z, n ∈ Z_+}

and diligently verify that

m/n + p/q = (mq + np)/(nq)

and so on. Perhaps (logically) we should carry out those deductions; how- 
ever, that would be time-consuming and it would be of virtually no help 
in understanding analysis. 

A similar comment is applicable to a few inequalities which can be 
deduced (easily) from the order axioms (B). If x < y, then x + z < y + z;
if x < y and 0 < c, then cx < cy. We use x ≤ y to mean x < y or x = y.
It is understood that y > x means the same thing as x < y. The absolute
value of a number x is defined by

|x| = x, if x ≥ 0
|x| = -x, if x < 0

and absolute value has these properties:

|xy| = |x| |y|
|x + y| ≤ |x| + |y|
|x - y| ≥ ||x| - |y||.


These inequalities will be used with little or no comment. 

Now, one might reasonably ask this. If we are not going to deduce the 
various properties of the real number system from (A), (B), and (C), why 
do we bother to list just those particular properties and to assert that they 
determine the real number system? There are two principal reasons. 

First, analysis is based upon the concept of number, and so we are 
obligated to state clearly what the real number system is. One way to do
that is to state that the system is characterized by two theorems: (i) There
exists a complete ordered field. (ii) Any two such fields are isomorphic;
that is, there exists a 1:1 correspondence between their members which
preserves addition, multiplication, and order. The second reason for listing 
(A), (B), and (C) is that it will help us understand the completeness prop- 
erty (C). A fair fraction of introductory analysis consists of learning the 
meaning of the completeness of the real number system and learning to 
use various reformulations of it. 

As we have suggested, we shall not prove here that the real number 
system exists or that it is unique. What we assume is a familiarity with 
calculations in an ordered field. The one aspect of the number system with 
which we do not assume much familiarity is the completeness. In the next 
two sections we begin to look at some implications of completeness. 
Right now, let us try to be clear about what it says. 

Intuitively, property (C) is intended to say that if one thinks of real 
numbers as corresponding to points on a line, then the line has no holes in 
it. How can one subdivide the “line” R into the union of two non-empty 
sets, S and T, such that every number in S is less than every number in T?
The only way to do that is to cut the line at some point, to let S be every- 
thing on one side of the cut and to let T be everything on the other side of 
the cut. Of course, the point where we cut must be put either in S or in T, 
and it will accordingly be the largest number in S or the smallest number in 
T. 

Precisely, suppose we choose any real number c. From c we obtain 
two slightly different subdivisions as described in (C): 


S = {s ∈ R; s ≤ c}
T = {t ∈ R; t > c}

or

S = {s ∈ R; s < c}
T = {t ∈ R; t ≥ c}.


The completeness property states that there are no other examples, the 
first type being the one in which S has a largest member, the second type 
being the one in which T has a smallest member. 


EXAMPLE 1. Let us look at the rational number system, which consists
of Q (the set of rational numbers), together with the addition, multiplica- 
tion, and ordering inherited from R. Since sums, differences, products, 
and quotients of rational numbers are rational, we see that if we substitute 
Q for R in (A), the field axioms are satisfied. Similarly, Q satisfies the 
order axioms (B). Thus, the rational number system is an ordered field; 
however, it is not complete. Long ago, the Greeks noted that (loosely
speaking) the set of rational numbers had holes in it, for instance, at
the place where √2 ought to be. More precisely, they proved that there
is no rational number x such that x^2 = 2. That means that we can cut the
set of rationals at √2 and show that Q is not complete, because the set
S = {s ∈ Q; s < √2} has no largest member and the set T = {t ∈ Q;
t > √2} has no smallest member. Let us describe the situation without
mentioning √2.
Suppose we define

(1.1)   S = {s ∈ Q; either s ≤ 0 or 0 < s and s^2 < 2}
        T = {t ∈ Q; 0 < t and t^2 > 2}.

Clearly S and T are non-empty subsets of Q and each number in S is less
than each number in T. Here is the important point. Since there does not
exist any x in Q with x^2 = 2, it follows that

Q = S ∪ T.

Does T have a smallest member? If t ∈ T, then t^2 > 2. If r is a very small
positive rational number, then we shall have (t - r)^2 > 2 (as well as t - r
> 0); i.e., we shall have (t - r) ∈ T. Hence T has no smallest member.
By similar reasoning, S has no largest member. Thus, the rational number
system does not have the completeness property (C).

Why can’t we give the same example in the real number system? Of
course, the completeness property says that we cannot. But, let’s try it and
see exactly what goes wrong. We define sets S and T as in (1.1), but replace
Q by R. Again, we conclude that S and T are non-empty and that every
number in S is less than every number in T. Again, we can show that S has
no largest member and that T has no smallest member. The completeness
property (C) leaves us with only one possibility, namely, that R ≠ S ∪ T,
i.e., that some real number belongs neither to S nor to T. It is very easy to
see that if x is a real number such that x ∉ S and x ∉ T, then x^2 = 2.
Thus, one of the things which completeness guarantees is that there exists
in R a square root for the number 2.
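The step in Example 1 from a member t of T to a smaller member t - r can be made explicit. What follows is a minimal sketch in Python, assuming exact rational arithmetic from the standard `fractions` module. The function names `in_T` and `smaller_member_of_T`, and the particular choice r = (t^2 - 2)/(2t), are ours for illustration and are not taken from the text. With that choice, (t - r)^2 = 2 + r^2 > 2 and t - r = (t^2 + 2)/(2t) > 0, so t - r is again in T.

```python
from fractions import Fraction

def in_T(t):
    """Membership test for T = {t in Q; 0 < t and t^2 > 2}."""
    return t > 0 and t * t > 2

def smaller_member_of_T(t):
    """Given t in T, return a rational t - r in T with r > 0."""
    if not in_T(t):
        raise ValueError("t must belong to T")
    # Illustrative choice of r (an assumption of this sketch):
    # r is a positive rational because t > 0 and t^2 > 2.
    r = (t * t - 2) / (2 * t)
    return t - r

t = Fraction(3, 2)
for _ in range(3):
    t = smaller_member_of_T(t)
    print(t, in_T(t))
```

Each pass produces a strictly smaller member of T, which is one concrete way to see that T has no smallest member.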


Exercises 


In Exercises 1-10, deduce the stated properties of real numbers from the 
basic properties (A), (B), and (C). 


1. If x < y and z < w, then x + z < y + w.

2. If x < 0, then -x > 0.

3. If x + y = x, then y = 0.

4. For each x in R, x0 = 0.

5. If x < y and y < z, then x < z.

6. If xy = 0, then either x = 0 or y = 0.

7. (a) (-x)y = -(xy). Hint: [x + (-x)]y = ?
(b) (-x)(-y) = xy.

8. For each x in R, x^2 ≥ 0.

9. If x < y, then x < ½(x + y) < y.

10. If x^2 + y^2 = 0, then x = y = 0.


11. Use the completeness of the real number system to prove that each positive 
real number has a unique positive square root. 


12. The set of integers, with the addition, multiplication, and ordering inherited 
from R, is not a complete ordered field. Precisely which of the conditions listed
under the headings (A), (B), (C) are not satisfied?


1.2. Consequences of Completeness 


We shall discuss a few applications of the completeness of the real 
number system. First, we need some basic terminology. 


Definition. Let A be a set of real numbers, i.e., a subset of R. We say
that A is bounded above if there exists a number b ∈ R such that

a ≤ b, for all a ∈ A.

Any such b is called an upper bound for the set A. We say that A is bounded
below if there exists a number c ∈ R such that

c ≤ a, for all a ∈ A.

Any such c is called a lower bound for the set A. We say that A is bounded if
A is bounded above and bounded below.


There are various simple observations we should make. The set A is
bounded below if and only if the set -A = {-x; x ∈ A} is bounded
above. If b is an upper bound for -A, then -b is a lower bound for A.
Such things are immediate from the fact that the condition x ≥ y is
equivalent to -x ≤ -y. The set A is bounded if and only if the set |A| =
{|x|; x ∈ A} is bounded above. If b is an upper bound for |A|, then

-b ≤ x ≤ b, x ∈ A.

On the other hand, if

c ≤ x ≤ d, x ∈ A

then the larger of |c| and |d| is an upper bound for the set |A|.
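The last observation is easy to check on a small finite set. The sketch below is only an illustration in Python; the helper name `abs_bound` and the sample set are hypothetical and do not come from the text.

```python
def abs_bound(c, d):
    """If c <= x <= d for all x in A, return an upper bound for |A|."""
    return max(abs(c), abs(d))

# A hypothetical bounded set, with lower bound min(A) and upper bound max(A).
A = [-7, -2, 0, 3.5]
b = abs_bound(min(A), max(A))
print(b, all(-b <= x <= b for x in A))
```

Here the lower bound has the larger absolute value, so it is |c| rather than |d| that bounds |A|.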
EXAMPLE 2. The set of positive real numbers

(1.2) R_+ = {x ∈ R; x > 0}

is an elementary example of a subset of R which is not bounded. It is
bounded below; in fact, any x ≤ 0 is a lower bound for R_+. But it is not
bounded above. If b were an upper bound, we could deduce in order:
b > 0, (b + 1) ∈ R_+, b ≥ b + 1, which is impossible.


EXAMPLE 3. The set of positive integers Z_+ is bounded below. It is not
bounded above. One of the properties of the real number system with which
the reader is supposed to be familiar is the Archimedean ordering property,
which states that if b ∈ R, then there is a positive integer greater than b.
We shall call that a theorem (Theorem 2), and prove it as an exercise in 
the use of completeness. 


Theorem 1. Let A be a non-empty subset of R which is bounded above. 
Then A has a least (smallest) upper bound. 


Proof. Let T be the set of all upper bounds for A:

T = {b ∈ R; x ≤ b for all x ∈ A}.

Let S be the complement of T:

S = {x ∈ R; x ∉ T}.

We can see easily that

(i) R = S ∪ T;
(ii) if s ∈ S and t ∈ T, then s < t.

We defined S so that (i) would be true. What does (ii) say? It says, if s ∉ T
and t ∈ T, then s < t; or, if t ∈ T and s ≥ t, then s ∈ T. The last statement
is clearly true. Look at the definition of T.

The hypothesis that A is bounded above is precisely the statement that
T is non-empty. The hypothesis that A is non-empty tells us that S is
non-empty, as follows. Choose any x ∈ A. Then S contains every number
y < x, because, if y < x then y is not an upper bound for A.

The completeness condition now tells us that either S has a largest
member or T has a smallest member. But S does not have a largest member.
Let s ∈ S, that is, let s ∉ T. Then s is not an upper bound for A.
Consequently there exists a number a ∈ A with a > s. The number d =
½(a + s) satisfies

s < d < a.

Since d < a, we have d ∈ S. Since s < d, we see that s is not the largest
member of S.
Therefore, T has a smallest (least) number in it. That is a number c
such that

(1.3) (i) c is an upper bound for the set A;
      (ii) if b is an upper bound for A, then c ≤ b.


In other words, c is the least upper bound for A.
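The cut (S, T) used in this proof also suggests a way to approximate sup A numerically, provided one can decide whether a given number is an upper bound for A. The following Python sketch makes that assumption; the function name `approximate_sup`, the bisection strategy, and the sample set A = {x ∈ R; x^2 < 2} are illustrative choices of ours, not part of the text. It keeps a number s in S and a number t in T and repeatedly tests their midpoint.

```python
from fractions import Fraction

def approximate_sup(is_upper_bound, s, t, steps=40):
    """Narrow [s, t] around sup A, where s is not an upper bound and t is."""
    if is_upper_bound(s) or not is_upper_bound(t):
        raise ValueError("need s in S (not an upper bound) and t in T")
    for _ in range(steps):
        d = (s + t) / 2
        if is_upper_bound(d):
            t = d      # d belongs to T
        else:
            s = d      # d belongs to S
    return s, t

def is_upper_bound_for_A(b):
    """Upper-bound test for the sample set A = {x in R; x^2 < 2}."""
    return b > 0 and b * b >= 2

s, t = approximate_sup(is_upper_bound_for_A, Fraction(1), Fraction(2))
print(float(s), float(t))
```

After each step s is still in S and t is still in T, while t - s is halved, so both endpoints close in on the least upper bound.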
Evidently, there is a companion result which asserts that, if a subset
A of R is non-empty and bounded below, it has a greatest lower bound.
That is a number c such that

(1.4) (i) c is a lower bound for A;
      (ii) if b is a lower bound for A, then c ≥ b.


Notation and Terminology. Let A be a non-empty subset of R. If A is
bounded above, the least upper bound for A is also called the supremum of
A and is denoted by

sup A.

If A is bounded below, the greatest lower bound for A is also called the
infimum of A and is denoted by

inf A.

One might wonder why we introduce other names for least upper
bound and greatest lower bound. One reason is that they occur so often
that they must be abbreviated, and “lub” and “glb” leave a little to be
desired.

Theorem 1 is a reformulation of the completeness of the real number
system. In Section 1.1, if one assumes Theorem 1 instead of property (C),
then it is easy to prove (C) as a theorem. The two properties are only slightly
different. Let’s use Theorem 1 to prove that the set of positive integers
is not bounded.


Theorem 2 (Archimedean Ordering Principle). If x is a real number,
there exists a positive integer n such that x < n.


Proof. Suppose Z_+ is bounded above. Let c = sup Z_+. Since c is the
least upper bound for Z_+, c - 1 is not an upper bound for Z_+. Therefore,
there exists a positive integer n such that c - 1 < n. So c < n + 1. But
that says that c is not an upper bound for Z_+. (!)


Corollary. If x > 0, there exists a positive integer n such that 1/n < x.


Corollary. If y - x ≥ 1, there is an integer n such that x < n ≤ y.


Proof. According to Theorem 2, there exists an integer m such that
x < m. There are at most finitely many integers k such that x < k ≤ m.
(That follows from the principle of mathematical induction.) Let n be the
least of those integers. It is a simple matter to verify that x < n ≤ y.


Corollary. If A is a bounded set of integers, then sup A and inf A are 
integers. 


Corollary. If x < y, there exists a rational number r such that x < r
< y.

Proof. Choose a positive integer n such that n(y - x) > 1. Then find
an integer m such that nx < m < ny. Let r = m/n.
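The proof just given is constructive, and it can be followed step by step. The sketch below does so in Python under the simplifying assumption that x and y are themselves given exactly (as integers or `Fraction` values), so that every comparison is exact. The function name `rational_between` and the particular choices of n and m are ours.

```python
from fractions import Fraction
import math

def rational_between(x, y):
    """Return a rational r = m/n with x < r < y, following the proof."""
    x, y = Fraction(x), Fraction(y)
    if not x < y:
        raise ValueError("need x < y")
    # A positive integer n with n*(y - x) > 1.
    n = math.floor(1 / (y - x)) + 1
    # The least integer m with n*x < m; then m <= n*x + 1 < n*y.
    m = math.floor(n * x) + 1
    return Fraction(m, n)

print(rational_between(Fraction(1, 3), Fraction(1, 2)))
```

The choice of m as the least integer greater than nx is what guarantees m < ny once n(y - x) > 1.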


Theorem 3. Let x be a positive real number and let n be a positive integer.
There is precisely one positive real number y such that y^n = x.


Proof. Let us make a simple basic observation. If s ≥ 0 and t ≥ 0,
then t > s if and only if t^n > s^n. That follows from the fact that

t^n - s^n = (t - s) f(t, s)

where f(t, s) = t^{n-1} + t^{n-2}s + ··· + t s^{n-2} + s^{n-1}. Since f(t, s) > 0
unless s = t = 0, the numbers t^n - s^n and t - s have the same sign.
Obviously (then) we cannot have two distinct positive nth roots. The
only problem is to prove that there exists at least one.
Let

A = {y ∈ R; y > 0 and y^n ≥ x}.


Then A is bounded below. Furthermore A is non-empty. In case x < 1,
we have 1^n > x so that 1 ∈ A; and, in case x ≥ 1 we have

x^n - x = x(x^{n-1} - 1)
        ≥ 0

so that x ∈ A. Let c = inf A. Certainly c ≥ 0, and the claim is that c^n = x.

First, we show that c^n ≤ x. Suppose c^n > x. Then we can find a
small positive number r such that (c - r)^n > x. (See following lemma.) By
the definition of A, (c - r) ∈ A. But, c - r < c and c is a lower bound
for A, a contradiction. It must be that c^n ≤ x.

The fact that no lower bound for A is greater than c will imply that
c^n ≥ x. Suppose c^n < x. We can find a small positive number r such that
(c + r)^n < x. Thus (c + r)^n < x ≤ y^n for all y ∈ A, which yields c + r <
y for all y ∈ A. So, c + r is a lower bound for A. But c + r > c; hence,
something is wrong. We conclude that c^n ≥ x.
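The number c = inf A in this proof can be bracketed numerically. The Python sketch below is an illustration only: the function name `nth_root_bracket` and the use of bisection are our assumptions, and x is taken to be rational so that the arithmetic is exact. It keeps a lower bound lo for A and a member hi of A, starting from hi = 1 or hi = x exactly as in the non-emptiness argument above.

```python
from fractions import Fraction

def nth_root_bracket(x, n, steps=30):
    """Return (lo, hi) with lo**n < x <= hi**n after `steps` halvings."""
    x = Fraction(x)
    if x <= 0 or n < 1:
        raise ValueError("need x > 0 and n >= 1")
    lo = Fraction(0)                 # 0 is a lower bound for A
    hi = max(Fraction(1), x)         # a member of A, as in the proof
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid**n >= x:
            hi = mid                 # mid belongs to A
        else:
            lo = mid                 # mid is a lower bound for A
    return lo, hi

lo, hi = nth_root_bracket(2, 2)
print(float(lo), float(hi))
```

Since lo is always a lower bound for A and hi always lies in A, the infimum c is trapped between them at every stage.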


Lemma. Let n be a positive integer. Let a, b, and c be real numbers such
that a < c^n < b. There exists a number δ > 0 such that a < (c + r)^n < b
for every r which satisfies |r| < δ.


Proof. We have

t^n - c^n = (t - c) f(t, c)

where

f(t, c) = t^{n-1} + t^{n-2}c + ··· + t c^{n-2} + c^{n-1}.

If we apply this with t = c + r, we obtain

(c + r)^n - c^n = r f(c + r, c)

and hence

|(c + r)^n - c^n| ≤ |r| |f(c + r, c)|.

Now

|f(c + r, c)| ≤ (|c| + |r|)^{n-1} + (|c| + |r|)^{n-2}|c| + ···
                + (|c| + |r|)|c|^{n-2} + |c|^{n-1}.

If |r| < 1, then

|f(c + r, c)| ≤ (1 + |c|)^{n-1} + (1 + |c|)^{n-2}|c| + ···
                + (1 + |c|)|c|^{n-2} + |c|^{n-1}
              = f(1 + |c|, |c|).

Therefore,

|(c + r)^n - c^n| ≤ |r| f(1 + |c|, |c|), if |r| < 1.

Although it is not necessary, we shall rewrite this inequality in a more
concrete form: Since (1 + |c|) - |c| = 1, the definition of f tells us that

f(1 + |c|, |c|) = (1 + |c|)^n - |c|^n.

So our inequality says

(1.5) |(c + r)^n - c^n| ≤ |r| [(1 + |c|)^n - |c|^n], if |r| < 1.

We are told that a < c^n < b and we want to ensure that a < (c + r)^n
< b, provided |r| is small. Let s be the smaller of the two numbers c^n - a
and b - c^n. Then, if

|(c + r)^n - c^n| < s

we shall have a < (c + r)^n < b. Define δ by

δ[(1 + |c|)^n - |c|^n] = s

and, if that number exceeds 1, replace δ by 1 so that (1.5) applies.
From (1.5) we then have

a < (c + r)^n < b, provided |r| < δ.
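The δ produced by this proof can be computed and tested directly. The Python sketch below does so with exact rational arithmetic; the function name `lemma_delta` is ours, and the cap at 1 is included because inequality (1.5) was derived only for |r| < 1. The sample values a = 1, b = 3, c = 7/5, n = 2 are hypothetical.

```python
from fractions import Fraction

def lemma_delta(a, b, c, n):
    """Return delta > 0 with a < (c + r)**n < b whenever |r| < delta."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if not a < c**n < b:
        raise ValueError("need a < c^n < b")
    s = min(c**n - a, b - c**n)
    bound = (1 + abs(c))**n - abs(c)**n    # the factor in inequality (1.5)
    return min(Fraction(1), s / bound)     # cap at 1, since (1.5) needs |r| < 1

c = Fraction(7, 5)
delta = lemma_delta(1, 3, c, 2)
samples = [delta * Fraction(k, 100) for k in range(-99, 100)]
print(delta, all(1 < (c + r)**2 < 3 for r in samples))
```

The check samples only finitely many r, so it illustrates the lemma rather than proving it.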


The reader may already be familiar with the conclusion of the last 
lemma—the nth power function is continuous. One should look at the 
proof anyway, since one cannot have too much experience in handling 
inequalities.

The unique y > 0 such that y^n = x is denoted either ⁿ√x or x^{1/n}.
Remember that x^{1/n} > 0. If n is even, there is another real number y such
that y^n = x, namely, y = -x^{1/n}. If n is odd, there is no other real nth root.


Exercises 


1. Is the set of rational numbers bounded below? 

2. Give an example of a bounded set A such that sup A is in A but inf A is not 
in A. 

3. Find all non-empty bounded sets A such that sup A ≤ inf A.

4. Is the empty set bounded above? Does it have a least upper bound?


5. Every subset of a bounded set is bounded. Any set which contains an un- 
bounded set is unbounded. (Unbounded means not bounded.) 


6. If A is bounded above and B is bounded below, then the intersection A ∩ B
is bounded.


7. Prove that, if x is any real number, then

x = sup {r ∈ Q; r < x}.


8. If x < y, there exists an irrational (not rational) number t such that x < t
< y.

9. Verify that every non-empty set of positive integers contains its infimum.

10. Let A be a subset of R which has uncountably many points in it. Prove that
there exists a non-empty set B ⊂ A such that sup B is not in B. (Uncountable is
defined in the Appendix.)


11. Prove the completeness property (C) from Theorem 1. 


12. Prove that, if a subset S of R (with the inherited addition, multiplication, and 
ordering) is a complete ordered field, then S = R. 


*13. Let R and S be complete ordered fields. Show that R and S are isomorphic,
i.e., show that there is a 1:1 correspondence between the members of R and the
members of S which preserves addition, multiplication, and order.


1.3. Intervals and Decimals 


This is a short section, in which we shall discuss the decimal represen- 
tations of real numbers. We shall not use these representations very much. 
The purpose of the section is twofold. It provides us with some concrete 
objects to which we can point and say, “There, if you will, are the real 
numbers.” More important, it will make us think about the relation of 
intervals to the completeness of the real number system. 


Definition. An interval is a set I ⊂ R such that


(i) I contains at least two points;
(ii) if x < t < y and if x, y ∈ I, then t ∈ I.


There are four types of bounded intervals, to which we shall refer 
repeatedly: 


The open interval (a, b) = {x ∈ R; a < x < b}
The closed interval [a, b] = {x ∈ R; a ≤ x ≤ b}
The semi-closed interval (a, b] = {x ∈ R; a < x ≤ b}
The semi-closed interval [a, b) = {x ∈ R; a ≤ x < b}.


It is understood that a, b are real numbers with a < b.
There are five types of unbounded intervals, to which we shall refer
occasionally:

(a, ∞) = {x ∈ R; a < x}
[a, ∞) = {x ∈ R; a ≤ x}
(-∞, b) = {x ∈ R; x < b}
(-∞, b] = {x ∈ R; x ≤ b}
(-∞, ∞) = R.


We have left for the exercises the proof that every interval is of one of 
the nine types listed. In the notations for unbounded intervals, there occur 
the symbols “-∞” and “∞”. There are no objects -∞ or ∞ in the real
number system; indeed, we have assigned no meaning whatever to “-∞”
and “∞”. The “far left” and the “far right” have their uses, but we'll talk
about that later. 

The decimal representation of a real number simply locates the num- 
ber in a nested sequence of intervals, the lengths of which go down by a 
factor of 10 each time. The Archimedean ordering property and mathematical
induction locate each $x \in R$ in the semi-closed interval $[n, n + 1)$,
defined by some integer n. That n is the greatest integer in x:

(1.6)    $n = \sup \{k \in Z;\ k \le x\}.$

Then $(x - n) \in [0, 1)$, and we shall confine our discussion of decimal
representation to numbers in that interval.

Consider a number $x \in [0, 1)$. To any such x will correspond a
decimal expansion

    $x \sim .a_1 a_2 a_3 \cdots$

where the “digits” are integers $a_n$ between 0 and 9. The process by which
we arrive at the expansion is assumed to be familiar. We subdivide $[0, 1)$
into 10 intervals $I_k$ of length $\frac{1}{10}$, and we enumerate them by
$k = 0, \ldots, 9$:

    $I_k = \left[\frac{k}{10}, \frac{k+1}{10}\right), \quad k = 0, \ldots, 9.$

We locate the $I_k$ which contains x, and that k is the digit $a_1$ (see
Figure 1). An alternative way of describing this first digit in the decimal
representation of x is

    $a_1 = \sup \{k \in Z;\ \frac{k}{10} \le x\}.$


FIGURE 1. The case $a_1 = 3$: the point x lies in the interval $I_3$.

Next, we subdivide the interval $I_{a_1}$ into 10 intervals, each of length
$1/10^2$. The intervals are

    $[a_1 10^{-1} + k 10^{-2},\ a_1 10^{-1} + (k + 1) 10^{-2}), \quad k = 0, \ldots, 9.$

The one of those intervals which contains x determines the digit $a_2$. In
other words,

    $a_2 = \sup \{k \in Z;\ a_1 10^{-1} + k 10^{-2} \le x\}.$


We repeat the subdivision process and continue. What do we end up with 
as a description of the decimal representation? It is a recursive definition
of the digits $a_1, a_2, a_3, \ldots$.

(i) $a_1$ is the largest integer k such that $k 10^{-1} \le x$.
(ii) After $a_1, \ldots, a_{n-1}$ have been determined, $a_n$ is the largest
integer k such that

    $a_1 10^{-1} + a_2 10^{-2} + \cdots + a_{n-1} 10^{-(n-1)} + k 10^{-n} \le x.$
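
The recursive definition of the digits can be tried out mechanically. The
following is a minimal sketch in Python, assuming that x is given as an exact
rational number; the function name `decimal_digits` is ours and is only
illustrative.

```python
from fractions import Fraction

def decimal_digits(x, count):
    """Return the digits a_1, ..., a_count of x in [0, 1), following (i) and (ii)."""
    if not (0 <= x < 1):
        raise ValueError("x must lie in [0, 1)")
    digits = []
    partial = Fraction(0)  # a_1*10^-1 + ... + a_{n-1}*10^-(n-1)
    for n in range(1, count + 1):
        step = Fraction(1, 10 ** n)
        # a_n is the largest integer k with partial + k*10^-n <= x
        k = (x - partial) // step
        digits.append(int(k))
        partial += k * step
    return digits

print(decimal_digits(Fraction(1, 3), 5))  # [3, 3, 3, 3, 3]
print(decimal_digits(Fraction(1, 8), 5))  # [1, 2, 5, 0, 0]
print(decimal_digits(Fraction(1, 2), 4))  # [5, 0, 0, 0]
```

Exact rational arithmetic is used so that the comparison with x is never
blurred by rounding. Note that 1/2 produces 5, 0, 0, 0 and not 4, 9, 9, 9.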


What we have done is to place x successively in the semi-closed intervals
$I_1, I_2, I_3, \ldots$ defined by

(1.7)    $I_n = [a_1 10^{-1} + a_2 10^{-2} + \cdots + a_n 10^{-n},\ a_1 10^{-1} + a_2 10^{-2} + \cdots + (a_n + 1) 10^{-n}).$

The intervals $I_n$ are “nested”

    $I_1 \supset I_2 \supset I_3 \supset \cdots$

and x belongs to the intersection of all the $I_n$. In fact

(1.8)    $\bigcap_n I_n = \{x\};$

that is, no other number belongs to every $I_n$. Why? If $a, b \in I_n$, then
$|a - b| < 10^{-n}$. If y (as well as x) belongs to every $I_n$, then

    $|y - x| < 10^{-n}, \quad n = 1, 2, 3, \ldots.$

Hence $y - x = 0$.

We know that, by the scheme just described, there is associated with
each $x \in [0, 1)$ a sequence of digits $a_1, a_2, a_3, \ldots$; and we know
from (1.8) that different x’s have different sequences of digits associated
with them. So, it is legitimate to employ the shorthand

    $x = .a_1 a_2 a_3 \cdots$

What is really being abbreviated is

(1.9)    $x = \sum_{n=1}^{\infty} a_n 10^{-n},$

but we'll worry about that later.
What interests us now is this. Is every sequence of digits $a_1, a_2, a_3, \ldots$
the decimal representation of a number $x \in [0, 1)$? Obviously not, if we
follow the ground rules thus far. Take $a_n = 9$ for every n. The intersection

    $\bigcap_n [1 - 10^{-n}, 1)$

is empty so that we cannot have $x \sim .999\ldots$ for any x in $[0, 1)$. We
know how to fix that. Had we considered $[0, 1]$ instead of $[0, 1)$, then
$.999\ldots$ would have arisen as the decimal representation of the number 1.
But there are still other sequences of digits which do not occur in our
process. Obviously, in the scheme we described, no decimal representation will
arise which is ultimately all 9’s:

(1.10)    $.a_1 a_2 \cdots a_n 999\ldots.$

But, we know how to fix that also. We allow a very slight ambiguity in the
decimal representation by agreeing that (1.10) represents the same number
as does

(1.11)    $.a_1 a_2 \cdots (a_n + 1)000\ldots$

provided $a_n \ne 9$.

The fuss about repeating 9’s is not, however, at the heart of the question
of whether a sequence of digits $a_1, a_2, \ldots$ need represent any real
number. The central problem is this. If someone hands us a sequence of
digits $.a_1 a_2 a_3 \ldots$, where will we find the real number which it
represents? The digits give us a nested sequence of intervals

    $I_1 \supset I_2 \supset I_3 \supset \cdots$

defined by (1.7). The x we want is supposed to be (in) the intersection of
that sequence of intervals. But the intersection may be empty, because of
the repeated 9’s business. So, we must replace the semi-closed interval

    $I_n = [b_n, c_n)$

by the closed interval

    $\bar{I}_n = [b_n, c_n].$

The intersection of the sets $\bar{I}_n$ will catch the right-hand end point
if the 9’s repeat. What we want to assert is that

    $\bigcap_n \bar{I}_n = \{x\}$

where $x \in [0, 1]$. Since the length of $\bar{I}_n$ is $10^{-n}$, there
cannot be more than one point in the intersection. It is the completeness of
the real number system which guarantees that there exists at least one x in
the intersection.


Theorem 4 (Nested Intervals Theorem). Let

    $I_1 \supset I_2 \supset I_3 \supset \cdots$

be a nested sequence of (bounded) closed intervals in R. Then there exists a
real number x which belongs to every $I_n$.

Proof. Let $I_n = [b_n, c_n]$. Then

    $b_1 \le b_2 \le b_3 \le \cdots \le c_3 \le c_2 \le c_1.$

Let B be the set which consists of all the numbers $b_n$, $n = 1, 2, 3, \ldots$.
Then B is non-empty, and B is bounded above because any $c_n$ is an upper
bound for B. Let

    $x = \sup B.$

We claim that $x \in I_n$ for every n, i.e., that $b_n \le x \le c_n$ for
every n. Certainly $b_n \le x$, since x is an upper bound for B. As we
remarked, every $c_n$ is an upper bound for B; hence $x \le c_n$.
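
To see the chain of inequalities in the proof at work, here is a small
illustrative sketch in Python. It is our own example, not part of the text's
argument: it bisects [1, 2] with rational end points, keeping
$b_n^2 \le 2 \le c_n^2$, so that the only real number in every interval is
$\sqrt{2}$. The function name is hypothetical.

```python
from fractions import Fraction

def nested_intervals_for_sqrt2(steps):
    """Bisect [1, 2], keeping b*b <= 2 <= c*c at every stage."""
    b, c = Fraction(1), Fraction(2)
    intervals = [(b, c)]
    for _ in range(steps):
        mid = (b + c) / 2
        if mid * mid <= 2:
            b = mid   # left end points increase
        else:
            c = mid   # right end points decrease
        intervals.append((b, c))
    return intervals

intervals = nested_intervals_for_sqrt2(20)
lefts = [b for b, _ in intervals]
rights = [c for _, c in intervals]

# b_1 <= b_2 <= ... and ... <= c_2 <= c_1, with every b_n below every c_m
assert all(p <= q for p, q in zip(lefts, lefts[1:]))
assert all(p >= q for p, q in zip(rights, rights[1:]))
assert max(lefts) <= min(rights)
print(float(lefts[-1]), float(rights[-1]))
```

Every end point here is rational, while the common point is not; this is the
kind of sequence asked for in Exercise 6 below.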


With the nested intervals theorem our discussion of decimals is complete.
We have a correspondence between the set of all numbers $x \in [0, 1]$
and the set of all sequences $.a_1 a_2 a_3 \ldots$ of digits
($a_n \in Z$, $0 \le a_n \le 9$). The correspondence is 1:1, except that the
sequences (1.10) and (1.11) must be identified.


Exercises 


1. Prove that every interval in R is of one of the nine types which we listed. 
2. Let $\{I_\alpha\}$ be any collection of intervals in R. Prove that the
intersection
    $\bigcap_\alpha I_\alpha$
is one of the following:
(a) The empty set. 


(b) A set with precisely one member. 
(c) An interval. 


3. Let A be a bounded subset of R. What is the intersection of all closed inter- 
vals which contain A? 


4. Give an example of a nested sequence of open intervals for which the
intersection is empty. Give similar examples for which the intersection is a
set with one member, an open interval, a closed interval, a semi-closed
interval.


5. What kind of a set can the intersection of a nested sequence of closed inter- 
vals be? 


6. Suppose you were working in the rational number system. Describe a nested 
sequence of closed intervals for which the intersection is empty. 


7. Use the nested intervals theorem to give a binary representation for each 
point in [0, 1]: 
    $x \sim .a_1 a_2 a_3 \ldots$
where the digits $a_n$ are either 0 or 1.


8. In Section 1.1, assume (A), (B), and the nested intervals theorem, and the
Archimedean ordering property. Prove the completeness property (C).


1.4. Euclidean Space


We presume that the reader knows something about Euclidean space
of n dimensions. If n is a positive integer, then

    $R^n = R \times \cdots \times R$

is the set of all n-tuples of real numbers. The points in $R^n$ will sometimes
be called vectors, and our standard notation for those points will be

    $X = (x_1, \ldots, x_n)$
    $Y = (y_1, \ldots, y_n)$

and so on. The number $x_i$ is the ith (standard) coordinate of X.

There is a natural (vector) addition on $R^n$, defined by adding the
coordinates:

    $X + Y = (x_1 + y_1, \ldots, x_n + y_n).$

There is a product, called scalar multiplication, defined for vectors
$X \in R^n$ and numbers $c \in R$ by

    $cX = (cx_1, \ldots, cx_n).$

With this addition and scalar multiplication, $R^n$ is a vector space. This
means that the vector addition satisfies conditions A(1)-A(4) of Section 1.1
and that the scalar multiplication satisfies

    $c(X + Y) = cX + cY$
    $(b + c)X = bX + cX$
    $1X = X.$

The zero vector for addition is the origin $0 = (0, \ldots, 0)$.


If X and Y are vectors in $R^n$, the (standard) inner product of X and Y
is the number

(1.12)    $\langle X, Y \rangle = x_1 y_1 + \cdots + x_n y_n.$

In many books, this is called the dot product and is denoted by $X \cdot Y$.
Evidently, the inner product has these properties:

(1.13)
    (i) $\langle X, Y \rangle = \langle Y, X \rangle$;
    (ii) $\langle cX + Y, Z \rangle = c\langle X, Z \rangle + \langle Y, Z \rangle$;
    (iii) $\langle X, X \rangle \ge 0$; if $\langle X, X \rangle = 0$ then $X = 0$.
If $X \in R^n$, the length (norm) of X is

    $|X| = \langle X, X \rangle^{1/2}.$

The distance from X to Y is $|X - Y|$. In order to see that length and
distance have their expected properties, it is most convenient to verify
Cauchy’s inequality: If $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ are real
numbers, then

(1.14)    $(x_1 y_1 + \cdots + x_n y_n)^2 \le (x_1^2 + \cdots + x_n^2)(y_1^2 + \cdots + y_n^2).$

Lemma (Cauchy’s Inequality). If X and Y are vectors in $R^n$, then

(1.15)    $|\langle X, Y \rangle| \le |X|\,|Y|.$

Furthermore, equality holds if and only if one of the two vectors is a scalar
multiple of the other.

Proof. If $Y = 0$, the inequality is a trivial equality. If $Y \ne 0$, use the
fact that

    $0 \le \langle X - cY, X - cY \rangle = |X|^2 - 2c\langle X, Y \rangle + c^2 |Y|^2$

and apply it with

    $c = \frac{\langle X, Y \rangle}{|Y|^2}.$

The result is Cauchy’s inequality. If equality holds, then $X = cY$.


Length has these properties: 


(i) $|X| \ge 0$; if $|X| = 0$ then $X = 0$;
(ii) $|cX| = |c|\,|X|$;
(iii) $|X + Y| \le |X| + |Y|$.

The triangle inequality (iii) follows from Cauchy’s inequality, because

    $|X + Y|^2 = |X|^2 + 2\langle X, Y \rangle + |Y|^2$
    $\qquad \le |X|^2 + 2|X|\,|Y| + |Y|^2$
    $\qquad = (|X| + |Y|)^2.$
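
Cauchy’s inequality and the triangle inequality are easy to test on examples.
The sketch below is a numerical check in Python, not a proof; the helper names
`inner` and `norm` are ours, and a small tolerance is added to absorb
floating-point rounding.

```python
import math
import random

def inner(x, y):
    """Standard inner product (1.12)."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length |X| = <X, X>^(1/2)."""
    return math.sqrt(inner(x, x))

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(5)]
    y = [random.uniform(-1, 1) for _ in range(5)]
    # Cauchy's inequality (1.15)
    assert abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12
    # Triangle inequality (iii)
    s = [a + b for a, b in zip(x, y)]
    assert norm(s) <= norm(x) + norm(y) + 1e-12

# Equality case: Y is a scalar multiple of X (here Y = 3X).
x = [1.0, 2.0, 2.0]
y = [3.0, 6.0, 6.0]
assert math.isclose(abs(inner(x, y)), norm(x) * norm(y))
```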

Let us say a brief word about geometry. There is a fuller discussion in 
Section 1.6. Normally, when we discuss a “vector” X in $R^n$, we are thinking
of the line segment from the origin to X, rather than the point X. If
we have two vectors, X and Y, and if neither is a scalar multiple of the
other, then those two vectors span a plane in $R^n$. That plane passes through
the origin, and it consists of all vectors $aX + bY$ with $a, b \in R$. The
vectors X and Y are two of the edges of a parallelogram in that plane. One
diagonal of that parallelogram extends from the origin to the point
$X + Y = (x_1 + y_1, \ldots, x_n + y_n)$, as in Figure 2. Suppose we let
$\theta$, $0 \le \theta \le \pi$, be the angle between the vectors X and Y.
Then $\theta$ measures the extent to which Cauchy’s inequality fails to be an
equality:

(1.16)    $\langle X, Y \rangle = |X|\,|Y| \cos\theta.$

It is particularly easy to verify (1.16) after one knows something about
orthogonal bases, because the use of such bases shows that (1.16) need only
be verified in $R^2$. See also Exercises 3 and 4 of Section 1.5.


FIGURE 2

EXAMPLE 4. Let us look at an application of Cauchy’s inequality to
matrices. For simplicity, we'll talk only about square matrices. A $k \times k$
matrix with real entries is (represented as) a square array of real numbers
having k rows and k columns:

    $A = [a_{ij}], \quad 1 \le i \le k, \quad 1 \le j \le k.$

We add matrices by adding the corresponding entries

    $A + B = [a_{ij} + b_{ij}]$

and similarly

    $cA = [ca_{ij}].$

Thus, the space of $k \times k$ matrices is just $R^{k^2}$, with the $k^2$
coordinates listed in k rows and k columns. The norm of the matrix A is given
by

    $|A|^2 = \sum_{i,j} a_{ij}^2.$


We shall denote the space of $k \times k$ matrices by $R^{k \times k}$, to
remind us of the arrangement of the $k^2$ coordinates into rows and columns.
This arrangement is pertinent in problems which involve matrix multiplication.
Matrix product is defined by $C = AB$, where

    $c_{ij} = \sum_r a_{ir} b_{rj}.$

It is associative: $(AB)C = A(BC)$, and it distributes over addition:
$A(B + C) = AB + AC$, $(A + B)C = AC + BC$. It is not commutative unless
$k = 1$, that is, generally $AB \ne BA$. What really interests us is the
relation of norm to matrix product, $|AB| \le |A|\,|B|$.


Theorem 5. If A and B are $k \times k$ matrices, then

    $|AB| \le |A|\,|B|.$

Proof.

    $|AB|^2 = \sum_{i,j} \left| \sum_r a_{ir} b_{rj} \right|^2$
    $\qquad \le \sum_{i,j} \left( \sum_r a_{ir}^2 \right) \left( \sum_s b_{sj}^2 \right)$    (Cauchy).

Now

    $|A|^2 = \sum_{i,r} a_{ir}^2$
    $|B|^2 = \sum_{s,j} b_{sj}^2$

and so it is apparent that

    $|AB|^2 \le |A|^2 |B|^2.$


We might remark that matrix multiplication can be used to express
inner products on the space of matrices in the following way:

(1.17)    $\langle A, B \rangle = \mathrm{trace}\,(AB^t).$

The trace of a matrix is the sum of its diagonal entries. The matrix $B^t$ is
the transpose of B. Its i, j entry is $b_{ji}$. In the case $B = A$, (1.17)
says

    $|A|^2 = \mathrm{trace}\,(AA^t).$
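
The facts in Example 4 can be checked by hand on small matrices. The following
Python sketch does so for one pair of 2 x 2 matrices chosen by us; the helper
names (`mat_mul`, `transpose`, `trace`, `norm_sq`) are illustrative and assume
matrices stored as lists of rows.

```python
def mat_mul(a, b):
    """Matrix product C = AB, with c_ij = sum over r of a_ir * b_rj."""
    k = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(k)) for j in range(k)]
            for i in range(k)]

def transpose(a):
    """The matrix whose i, j entry is a_ji."""
    return [list(col) for col in zip(*a)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def norm_sq(a):
    """|A|^2, the sum of the squares of all entries."""
    return sum(entry * entry for row in a for entry in row)

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
AB = mat_mul(A, B)

assert mat_mul(A, B) != mat_mul(B, A)                 # AB and BA differ
assert norm_sq(AB) <= norm_sq(A) * norm_sq(B)         # Theorem 5, squared
assert trace(mat_mul(A, transpose(A))) == norm_sq(A)  # (1.17) with B = A
```

Integer entries keep every comparison exact, so no tolerance is needed here.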


1.5. Complex Numbers 


The complex number system is (essentially) obtained by adjoining to 
the real number system a square root for the number —1. The enlarged 
“system” has (in one sense) less structure, because the ordering of the real 
numbers does not extend to an ordering of the complex numbers. But, the 
complex system is richer in ways which make it indispensable for under- 
standing parts of mathematics. For instance, we obtain complex numbers 
by introducing a zero for the polynomial $x^2 + 1$; but it turns out that every
non-constant polynomial with real (or even complex) coefficients has a
zero in the set of complex numbers. That is the so-called fundamental
theorem of algebra, which we shall prove later.

We mildly caution the reader on two points: (1) This section is brief; 
however, the importance of complex numbers in this book should not be 
underestimated. (2) Mathematicians have retained the mystical termi- 
nology of “complex” and “real” and “imaginary” numbers; however, the 
terms are not now intended to suggest anything about reality or the ab- 
sence thereof. 

We list some properties which characterize the complex number sys- 
tem. Let C be the set of complex numbers. 


1. C is a field; i.e., there is an addition and a multiplication on C 
which satisfy the field axioms (A) of Section 1.1. 
2. C contains R as a subfield.
3. There is an element $i \in C$ such that $i^2 = -1$.
4. If a subfield of C contains R and i, then that subfield is (all of) C.


The subset S is a subfield if S contains 0 and 1 and is closed under the 
formation of sums, differences, products, and quotients. Thus, (2) says that 
R sits in C and that when we restrict to R the addition and multiplication 
of C, they become the addition and multiplication of R. If S is a subfield 
which contains R and i, then S contains every element of C of the form 
x + iy, with x and y in R. On the other hand, one can show that the set of
those numbers is a subfield; hence, by (4) it exhausts C. 

Therefore C consists of the numbers 


(1.18)    $z = x + iy, \quad x, y \in R$

added and multiplied according to the usual rules of algebra (the field
axioms), with $i^2 = -1$. The representation (1.18) of the number z is
unique. We call x the real part of z:

    $x = \mathrm{Re}\,(z)$

and we call y the imaginary part of z:

    $y = \mathrm{Im}\,(z).$


Notice that the imaginary part is real (a real number). The (complex)
conjugate of z is the number

(1.19)    $z^* = x - iy.$

Note that $(z + w)^* = z^* + w^*$ and $(zw)^* = z^* w^*$. Since

(1.20)    $zz^* = x^2 + y^2 \ge 0$

it makes sense to define the absolute value of z

(1.21)    $|z| = (zz^*)^{1/2}.$


In connection with (1.20), we might remark that if $w \in C$ and we write
$w \ge 0$, that is understood to mean that w is real and $w \ge 0$.
It is a straightforward matter to verify that

(1.22)    $|z + w| \le |z| + |w|$
          $|zw| = |z|\,|w|.$

Since absolute value preserves products, the number $z/|z|$ has absolute
value 1 (if $z \ne 0$). Hence each non-zero complex number z is uniquely
expressible in the form

(1.23)    $z = rw, \quad r > 0, \quad |w| = 1.$


There is a 1:1 correspondence between complex numbers and points
in $R^2$ which is so immediate that we often identify C and $R^2$ as sets. The
number $z = x + iy$ corresponds to the point (x, y) in $R^2$. Addition of
complex numbers corresponds to the vector addition in $R^2$; briefly, we add
complex numbers by adding their real and imaginary parts:

    $\mathrm{Re}\,(z + w) = \mathrm{Re}\,(z) + \mathrm{Re}\,(w)$
    $\mathrm{Im}\,(z + w) = \mathrm{Im}\,(z) + \mathrm{Im}\,(w).$

Thus, we have the parallelogram picture of addition in C. Absolute value
corresponds to length in $R^2$:

    $|z| = (x^2 + y^2)^{1/2}.$

Conjugation corresponds to reflection about the real line.


The geometric interpretation of multiplication involves angles. Let 
us look at complex numbers w of absolute value 1:

    $w = u + iv$
    $u^2 + v^2 = 1.$

These points (u, v) comprise the circle of radius 1 centered at the origin,
usually called the unit circle. Each w on the unit circle is uniquely located
by the angle $\theta$ from the vector 1 to the vector w. (See Figure 3.)
Furthermore,

    $u = \cos\theta$
    $v = \sin\theta$

because that is the usual definition of $\cos\theta$ and $\sin\theta$. Thus

(1.24)    $w = \cos\theta + i\sin\theta.$

If angles are measured by numbers, i.e., if $\theta$ “is” a number in our
discussion, then (1.24) determines a unique $\theta$, $0 \le \theta < 2\pi$.
Any number $\theta + 2k\pi$, $k \in Z$, would then serve as well.


FIGURE 3 


Now

    $\cos\theta + i\sin\theta = e^{i\theta}.$


(We’ll discuss that carefully later.) It is then clear that each non-zero
$z \in C$ can be expressed

(1.25)    $z = re^{i\theta}, \quad r > 0, \quad \theta \in R.$

Of course $r = |z|$. There are many such $\theta$’s, all differing by integer
multiples of $2\pi$. Each $\theta$ is an argument of z. If we also have

    $w = \rho e^{it}, \quad \rho > 0, \quad t \in R$

then

    $zw = (r\rho)e^{i(\theta + t)}.$


In other words, multiplication of complex numbers multiplies the absolute 
values and adds the arguments. If the same fact is verified without using 
the exponential function, it amounts to the verification of the trigonometric 
identities 
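    $(\cos\theta + i\sin\theta)(\cos t + i\sin t) = \cos(\theta + t) + i\sin(\theta + t)$

that is,

    $\cos(\theta + t) = \cos\theta \cos t - \sin\theta \sin t$
    $\sin(\theta + t) = \sin\theta \cos t + \cos\theta \sin t.$

Each non-zero complex number z has n distinct nth roots. Write

    $z = |z| e^{i\theta}, \quad 0 \le \theta < 2\pi$

and let $\alpha = \theta/n$. Then the numbers

    $z_k = |z|^{1/n} e^{i\theta_k}, \quad \theta_k = \alpha + k\frac{2\pi}{n}, \quad k = 0, \ldots, n - 1$

are the nth roots:

    $z_k^n = z, \quad k = 0, \ldots, n - 1.$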
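
The recipe for nth roots translates directly into a short computation. The
following Python sketch uses the standard `cmath` module; the function name
`nth_roots` is ours, and the results are floating-point approximations.

```python
import cmath
import math

def nth_roots(z, n):
    """Return the n distinct nth roots of a non-zero complex number z."""
    if z == 0:
        raise ValueError("z must be non-zero")
    r = abs(z)
    theta = cmath.phase(z) % (2 * math.pi)  # the argument in [0, 2*pi)
    alpha = theta / n
    # |z|^(1/n) * e^{i(alpha + 2*pi*k/n)}, k = 0, ..., n - 1
    return [cmath.rect(r ** (1.0 / n), alpha + 2 * math.pi * k / n)
            for k in range(n)]

roots = nth_roots(-8, 3)
for w in roots:
    assert abs(w ** 3 - (-8)) < 1e-9  # each root cubes back to -8
```

Raising each root to the nth power multiplies its argument by n, so the
arguments $n\alpha + 2\pi k$ all give back z, as the text describes.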


On a few occasions, we shall want to discuss complex n-space, $C^n$. This
is the set of all n-tuples of complex numbers $(z_1, \ldots, z_n)$. Vector
addition and scalar multiplication are defined formally as in $R^n$, replacing
R by C. The standard inner product on $C^n$ is

    $\langle Z, W \rangle = z_1 w_1^* + \cdots + z_n w_n^*.$

It has the properties (1.13), except that

    $\langle W, Z \rangle = \langle Z, W \rangle^*.$

Thus, Cauchy’s inequality relating inner product and length is valid, with
essentially the same proof. Apply the inequality
$0 \le \langle Z + cW, Z + cW \rangle$ with

    $c = -\frac{\langle Z, W \rangle}{\langle W, W \rangle}.$

In terms of coordinates, Cauchy’s inequality states that

    $|z_1 w_1^* + \cdots + z_n w_n^*|^2 \le (|z_1|^2 + \cdots + |z_n|^2)(|w_1|^2 + \cdots + |w_n|^2).$

There is a (real vector space) isomorphism between $C^n$ and $R^{2n}$. If
$z_j = x_j + iy_j$, then

    $(z_1, \ldots, z_n) \to (x_1, y_1, x_2, y_2, \ldots, x_n, y_n)$

maps $C^n$ onto $R^{2n}$. It preserves sums, multiplication by real scalars,
and length:

    $|z_1|^2 + \cdots + |z_n|^2 = x_1^2 + y_1^2 + \cdots + x_n^2 + y_n^2.$


EXAMPLE 5. The discussion of matrices in Example 4 can be extended
immediately to the complex case. The space of $k \times k$ matrices with
complex entries behaves like complex $k^2$-space. In this case, the inner
product is described this way:

    $\langle A, B \rangle = \mathrm{trace}\,(AB^*)$

where $B^*$ is the conjugate transpose of B. Its i, j entry is $b_{ji}^*$. The
verification that

    $|AB| \le |A|\,|B|$

is as in Theorem 5, since we have Cauchy’s inequality for complex numbers.
We shall (of course) denote the space of complex $k \times k$ matrices by
$C^{k \times k}$.


Exercises 


1. If z is a complex number and $|1 - z| = |1 + z| = 1$, then $z = 0$.

2. If $|z| < 1$ and $|w| = 1$, then

    $\left| \frac{w + z}{1 + z^* w} \right| = 1.$

3. If we identify C with $R^2$, then the inner product of z and w is

    $\langle z, w \rangle = \mathrm{Re}\,(zw^*).$

4. If we identify C with $R^2$,

    $\langle z, w \rangle = |z|\,|w| \cos\alpha$

where $\alpha$ is the angle from z to w.

5. Let $0 < r < 1$. Then

    $\sup \left\{ \left| \frac{w + r}{w - r} \right| ;\ w \in C,\ |w| = 1 \right\} = \frac{1 + r}{1 - r}.$

6. Each complex $k \times k$ matrix A is uniquely expressible in the form
$A = A_1 + iA_2$ where the matrices $A_j$ have real entries. Is it true that

    $|A|^2 = |A_1|^2 + |A_2|^2$?

7. Each complex $k \times k$ matrix A is uniquely expressible in the form
$A = A_1 + iA_2$ where $A_j = A_j^*$ (the conjugate transpose). Is it true that

    $|A|^2 = |A_1|^2 + |A_2|^2$?


1.6. Linear Geometry 


We shall review a few basic facts from linear analytic geometry. We 
hope that the reader is acquainted with this material. It is not essential for 
reading this book; however, we shall refer to this material often in the 
examples and exercises. 

Suppose that X and Y are distinct points in $R^n$. The (straight) line
through X and Y is

    $\{tX + (1 - t)Y;\ t \in R\}.$

A subset S of $R^n$ is called flat if it has this property: If X and Y are in
S, then the line through X and Y is contained in S. A (linear) subspace of
$R^n$ is a flat subset which contains the origin.

A subspace is more commonly defined as a non-empty subset S such that

(i) if X and Y are in S, then (X + Y) is in S;
(ii) if X is in S and c is any real number, then cX is in S.

In this formulation, one may replace (i) and (ii) by the single condition:
If X and Y are in S, then $cX + Y$ is in S for all $c \in R$. This is
equivalent to the flat subset characterization which we used as the
definition. The essential point is that if $X_1, \ldots, X_k$ are vectors in
the subspace S, then every linear combination

    $c_1 X_1 + \cdots + c_k X_k$

of those vectors is in S. If we start with any vectors $X_1, \ldots, X_k$ in
$R^n$, the set of all linear combinations of those vectors is a subspace,
called the subspace spanned by $X_1, \ldots, X_k$.

Note that a line is a flat subset and consequently is a subspace if and 
only if it passes through the origin. Planes through the origin will be 
(defined as) the 2-dimensional subspaces. Let us say something about di- 
mension. 

The vectors $V_1, \ldots, V_k$ are linearly dependent if it is possible to
express the 0 vector as a linear combination of them in some non-trivial way

    $c_1 V_1 + \cdots + c_k V_k = 0, \quad c_j \ne 0$ for at least one j.

Linearly independent means not linearly dependent. If $V_1, \ldots, V_k$ are
linearly independent, then $V_i \ne 0$ and also $V_i \ne V_j$ when $i \ne j$.

The most basic fact about linear dependence is this. If $V_1, \ldots, V_k$
are vectors in $R^n$ and if $k > n$, then $V_1, \ldots, V_k$ are linearly
dependent. In short, any n + 1 vectors in $R^n$ are linearly dependent. That is
a reformulation of the basic theorem on systems of linear equations. For,
suppose we wish to find numbers $c_1, \ldots, c_k$ such that
$c_1 V_1 + \cdots + c_k V_k = 0$. If

    $V_i = (v_{i1}, \ldots, v_{in}), \quad 1 \le i \le k$

then the condition on the numbers $c_i$ is

    $c_1 v_{11} + \cdots + c_k v_{k1} = 0$
    $c_1 v_{12} + \cdots + c_k v_{k2} = 0$
    $\qquad \vdots$
    $c_1 v_{1n} + \cdots + c_k v_{kn} = 0.$

If $k > n$, this homogeneous system of n linear equations in k unknowns
has a solution for $(c_1, \ldots, c_k)$ which is non-trivial (not every
$c_i = 0$).

If S is a subspace of $R^n$, the dimension of S is the maximum number of
linearly independent vectors which can be found in S. More precisely,
dim S is the largest non-negative integer k such that some k-tuple of
vectors in S is linearly independent. Of course dim $S \le n$. It is not
difficult to see that dim $S < n$, except for the one subspace $S = R^n$.

An (ordered) basis for a subspace S is a k-tuple of vectors $V_1, \ldots, V_k$
such that

(i) $V_1, \ldots, V_k$ are linearly independent;
(ii) S is the subspace spanned by $V_1, \ldots, V_k$.

The simplest example of a basis is the standard basis for $R^n$

(1.26)
    $E_1 = (1, 0, 0, \ldots, 0)$
    $E_2 = (0, 1, 0, \ldots, 0)$
    $\qquad \vdots$
    $E_n = (0, 0, 0, \ldots, 1).$

The unique expression for $X = (x_1, \ldots, x_n)$ as a linear combination of
those vectors is

    $X = x_1 E_1 + \cdots + x_n E_n.$


Theorem 6. If S is a subspace of $R^n$, then S has a basis, and every basis
for S consists of precisely dim S vectors.

Proof. If d = dim S, then S has a basis consisting of d vectors: We
can find vectors $Y_1, \ldots, Y_d$ in S which are independent. By the
definition of d, we know that, for any $X \in S$, the vectors
$Y_1, \ldots, Y_d, X$ are dependent. But that means that X is a linear
combination of the vectors $Y_j$.

Suppose $V_1, \ldots, V_k$ is any (ordered) basis for the subspace S. Each
X in S can be expressed as a linear combination

(1.27)    $X = c_1 V_1 + \cdots + c_k V_k.$

Furthermore, the numbers $c_1, \ldots, c_k$ are uniquely determined by X (and
the basis). If we had another expression for X as a linear combination, we
could subtract and contradict the independence of $V_1, \ldots, V_k$. The
k-tuple $(c_1, \ldots, c_k)$ is the k-tuple of coordinates of X relative to the
ordered basis $V_1, \ldots, V_k$. It is clear that (1.27) defines a 1:1
correspondence between the set of all vectors X in S and the set of all
k-tuples $(c_1, \ldots, c_k)$ in $R^k$. If we add two vectors X and Y in S, the
corresponding coordinates add; and, multiplication of X by the number t
multiplies each coordinate $c_i$ by t. Thus, as far as linear operations are
concerned, S behaves just like $R^k$ (S is isomorphic to $R^k$). In particular,
any k + 1 vectors in S are linearly dependent. If d = dim S, there exist d
independent vectors in S. Thus $d \le k$. But $k \le d$ by the definition of d.


Suppose $V_1, \ldots, V_n$ is a basis for $R^n$. Then we can describe each X in
$R^n$ by its coordinates relative to that basis, as well as by its standard
coordinates. The standard coordinates are the coordinates relative to the
basis $E_1, \ldots, E_n$ (1.26). How do we get from one set of coordinates to
the other? If

(1.28)    $X = (x_1, \ldots, x_n) = c_1 V_1 + \cdots + c_n V_n$

then

    $x_j = c_1 v_{1j} + \cdots + c_n v_{nj}, \quad j = 1, \ldots, n.$

That says that

(1.29)    $X = CQ$

where $X = (x_1, \ldots, x_n)$ and $C = (c_1, \ldots, c_n)$ are $1 \times n$
matrices and where Q is the $n \times n$ matrix

    $Q = \begin{bmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & & \vdots \\ v_{n1} & \cdots & v_{nn} \end{bmatrix}.$


Since $V_1, \ldots, V_n$ is a basis, each vector $E_i$ is a linear combination
of $V_1, \ldots, V_n$:

    $E_i = p_{i1} V_1 + \cdots + p_{in} V_n, \quad i = 1, \ldots, n.$

The $n \times n$ matrix $P = [p_{ij}]$ then has the property that

(1.30)    $C = XP$

for each X in $R^n$. The matrices P and Q are (easily seen to be) related by

(1.31)    $PQ = QP = I$

where I is the $n \times n$ identity matrix: $I_{ij} = \delta_{ij}$. Thus, the
matrix Q is invertible and $P = Q^{-1}$ (similarly $Q = P^{-1}$).
If we begin with n vectors $V_1, \ldots, V_n$ in $R^n$, they form a basis for
$R^n$ if and only if the matrix $Q = [v_{ij}]$ is invertible. In turn, that
happens if and only if the determinant of Q is different from zero.
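
A small worked case may help fix the roles of P and Q. The sketch below, in
Python with exact fractions, takes the basis $V_1 = (1, 1)$, $V_2 = (1, 2)$ of
$R^2$ (our choice, purely for illustration) and computes coordinates by
$C = XP$ with $P = Q^{-1}$. The helper names are hypothetical, and the inverse
formula is the usual one for 2 x 2 matrices only.

```python
from fractions import Fraction

def inverse_2x2(q):
    """Inverse of a 2 x 2 matrix; fails when the determinant is zero."""
    (a, b), (c, d) = q
    det = a * d - b * c
    if det == 0:
        raise ValueError("rows are linearly dependent: not a basis")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def row_times_matrix(x, m):
    """Product of a 1 x n row x with an n x n matrix m."""
    return [sum(x[i] * m[i][j] for i in range(len(x)))
            for j in range(len(m[0]))]

Q = [[1, 1],   # V_1
     [1, 2]]   # V_2
P = inverse_2x2(Q)

X = [3, 5]
C = row_times_matrix(X, P)           # coordinates relative to V_1, V_2
assert row_times_matrix(C, Q) == X   # (1.29): X = CQ
print(C)
```

Here C works out to (1, 2), which agrees with $1 \cdot V_1 + 2 \cdot V_2 = (3, 5)$.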

The simplest bases are orthonormal bases, and we should say something
about them. The vectors X, Y in $R^n$ are called orthogonal (perpendicular)
if $\langle X, Y \rangle = 0$. The Pythagorean property is immediate from the
definition: If X and Y are orthogonal, then

    $|X + Y|^2 = |X|^2 + |Y|^2.$

The k-tuple of vectors $V_1, \ldots, V_k$ is called orthogonal if $V_i$ is
orthogonal to $V_j$ for $i \ne j$. The k-tuple is called orthonormal if it is
orthogonal and each $V_j$ has length 1. Thus, orthonormality means

(1.32)    $\langle V_i, V_j \rangle = \delta_{ij}.$

Suppose we have an orthonormal k-tuple $V_1, \ldots, V_k$. If X is a linear
combination of those vectors

    $X = c_1 V_1 + \cdots + c_k V_k$

it is easy to compute from (1.32) that

(1.33)    $c_i = \langle X, V_i \rangle, \quad i = 1, \ldots, k.$

In particular, if X = 0, then $c_i = 0$ for all i. Therefore, orthonormality
implies linear independence.

Suppose X is any vector in $R^n$. Let

(1.34)    $Y = \langle X, V_1 \rangle V_1 + \cdots + \langle X, V_k \rangle V_k$
          $Z = X - Y.$

By the remark in the last paragraph

    $\langle X, V_i \rangle = \langle Y, V_i \rangle.$

Hence Z is orthogonal to every $V_i$. Of course, it may be that Z = 0; but,
clearly Z = 0 if and only if X is in the subspace spanned by
$V_1, \ldots, V_k$.


Theorem 7. Every subspace of $R^n$ has an orthonormal basis.

Proof. If $V_1, \ldots, V_k$ is an orthonormal k-tuple of vectors in S, how
large can k be? Certainly $k \le$ dim S, since orthonormality implies
independence. Look at the largest possible k. Then every vector X in S must
be a linear combination of $V_1, \ldots, V_k$. Otherwise, (1.34) would yield a
vector Z in S so that the (k + 1)-tuple $V_1, \ldots, V_k, Z/|Z|$ was
orthonormal.
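
The proof suggests a procedure: keep enlarging an orthonormal tuple by
subtracting from a new vector X its combination Y as in (1.34) and normalizing
the remainder Z. The Python sketch below carries this out for a list of
spanning vectors. It is an illustration under floating-point arithmetic, not
the text's argument; the name `orthonormalize` and the tolerance `tol` used to
decide when Z counts as zero are our assumptions.

```python
import math

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def orthonormalize(vectors, tol=1e-9):
    """Build an orthonormal basis for the subspace spanned by `vectors`."""
    basis = []
    for x in vectors:
        # Z = X - (<X, V_1> V_1 + ... + <X, V_k> V_k), as in (1.34)
        z = list(x)
        for v in basis:
            c = inner(x, v)
            z = [zi - c * vi for zi, vi in zip(z, v)]
        length = math.sqrt(inner(z, z))
        if length > tol:  # Z is not 0: X lies outside the current span
            basis.append([zi / length for zi in z])
    return basis

# The third vector is the sum of the first two, so the span is 2-dimensional.
basis = orthonormalize([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0]])
print(len(basis))
```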


Suppose $V_1, \ldots, V_n$ is an orthonormal n-tuple of vectors in $R^n$. Then
they constitute an (orthonormal) basis for $R^n$. The coordinates of a vector
X relative to the basis $V_1, \ldots, V_n$ are easy to compute as in (1.33):

(1.35)    $X = c_1 V_1 + \cdots + c_n V_n$
          $c_i = \langle X, V_i \rangle.$

This is a great simplification over the situation for a general basis, where
one needs to invert the matrix $Q = [v_{ij}]$ before many coordinates can be
computed. In the orthonormal case, the inverse of Q is the transpose
matrix $Q^t = [v_{ji}]$. The conditions $\langle V_i, V_j \rangle = \delta_{ij}$
say precisely that

(1.36)    $QQ^t = I.$

Such a matrix Q is called an orthogonal matrix.

The vectors $V_1, \ldots, V_n$:

    $V_i = (v_{i1}, \ldots, v_{in})$

form an orthonormal basis for $R^n$ if and only if the matrix $Q = [v_{ij}]$ is
an orthogonal matrix. If Q is orthogonal, then the coordinates
$C = (c_1, \ldots, c_n)$ of the vector X relative to the orthonormal basis
$V_1, \ldots, V_n$ are given by

    $C = XQ^t$

or

    $c_i = \langle X, V_i \rangle.$

If one wishes to study a subspace S, it is often most convenient to use
the orthogonal complement of S:

(1.37)    $S^\perp = \{X \in R^n;\ \langle X, Y \rangle = 0$ for all $Y \in S\}.$

Theorem 8. Let S be a subspace of $R^n$. Each vector X in $R^n$ is uniquely
expressible in the form

(1.38)    $X = Y + Z, \quad Y \in S, \quad Z \in S^\perp.$

Proof. Suppose we have X decomposed as in (1.38). Let $V_1, \ldots, V_k$
be any orthonormal basis for S. Since Y is in S

    $Y = \langle Y, V_1 \rangle V_1 + \cdots + \langle Y, V_k \rangle V_k.$

Since Z is orthogonal to each $V_i$, we have
$\langle Y, V_i \rangle = \langle X, V_i \rangle$. Hence,

(1.39)    $Y = \langle X, V_1 \rangle V_1 + \cdots + \langle X, V_k \rangle V_k$
          $Z = X - Y.$

That determines Y and Z uniquely. On the other hand, given X, we can
define Y and Z by those formulas and clearly Y is in S and Z is in $S^\perp$.

If X is in $R^n$, the vector Y in Theorem 8 is called the orthogonal
projection of X on S. Even in the proof of the theorem we used a particular
orthonormal basis to define Y (1.39); however, as the proof shows, Y is
independent of the basis. Notice how norms and inner products behave
relative to orthogonal projections. Suppose $Y_1$ is the orthogonal
projection of $X_1$ on S and $Y_2$ is the orthogonal projection of $X_2$:

    $X_1 = Y_1 + Z_1$
    $X_2 = Y_2 + Z_2.$

Then

    $\langle X_1, X_2 \rangle = \langle Y_1, Y_2 \rangle + \langle Z_1, Z_2 \rangle$
    $|X_j|^2 = |Y_j|^2 + |Z_j|^2.$

We have described subspaces by bases. There is a dual point of view in
which we describe a subspace by giving a set of equations for it. The
equations are linear; that is, they are of the form $f(X) = 0$ where f is a
linear function

    $f: R^n \to R$
    $f(cX + Y) = cf(X) + f(Y).$

If f is such a (real-valued) linear function on $R^n$, there are numbers
$a_1, \ldots, a_n$ such that f has the form

    $f(X) = a_1 x_1 + \cdots + a_n x_n.$

Suppose $V_1, \ldots, V_n$ is an ordered basis for $R^n$. There is an
associated n-tuple of coordinate functions $f_1, \ldots, f_n$. The function
$f_i$ assigns to the vector X its ith coordinate relative to the basis
$V_1, \ldots, V_n$. In short

    $X = f_1(X)V_1 + \cdots + f_n(X)V_n.$

Certainly $f_1, \ldots, f_n$ are linear functions. Their specific form is

    $f_i(X) = p_{1i} x_1 + \cdots + p_{ni} x_n$

where $P = [p_{ij}]$ is the matrix of (1.30):

    $E_i = p_{i1} V_1 + \cdots + p_{in} V_n.$

The functions $f_1, \ldots, f_n$ are uniquely determined by the fact that they
are linear and that

    $f_i(V_j) = \delta_{ij}.$

Theorem 9. If S is a k-dimensional subspace of $R^n$ then S is the solution
space for a system of (n − k) homogeneous linear equations.

Proof. Let $V_1, \ldots, V_k$ be a basis for S. We can find vectors
$V_{k+1}, \ldots, V_n$ such that $V_1, \ldots, V_n$ is a basis for $R^n$.
Consider the corresponding coordinate functions $f_1, \ldots, f_n$. Obviously S
consists of all vectors X in $R^n$ such that

    $f_i(X) = 0, \quad i = k + 1, \ldots, n.$

Notice that in the proof we could use orthonormal bases. The conclusion is
that there are (n − k) vectors $V_{k+1}, \ldots, V_n$ in $R^n$ such that

    $S = \{X;\ \langle X, V_i \rangle = 0,\ i = k + 1, \ldots, n\}.$

A hyperplane in $R^n$ is a level set of a non-zero linear function f:

    $H = \{X;\ f(X) = c\}.$

A hyperplane is a flat subset of dimension (n − 1). For, if we are given
such an H and if we choose any $X_0$ such that $f(X_0) = c$, then
$f(X - X_0) = 0$ for every X in H. Thus X is in H if and only if

    $f(X - X_0) = 0.$

The set of vectors which f sends into 0 is an (n − 1)-dimensional subspace.
Thus, every hyperplane is a subspace of dimension (n − 1), translated by a
fixed vector in $R^n$. Similarly, we see that each non-empty flat
subset of $R^n$ is a subspace, translated by a fixed vector. Theorem 9 says
that every k-dimensional subspace of $R^n$ is the intersection of n − k
hyperplanes (through the origin). Thus, we see that every non-empty flat subset
F is an intersection of hyperplanes; in fact, F can be described by
specifying (not more than n) linear conditions on the coordinates of the
vectors in F.


2. Convergence and Compactness


2.1. Convergent Sequences 


Analysis is founded upon the concept of limit; indeed, the central 
role played by limiting operations is essentially what defines the branch of 
mathematics known as analysis. We shall first take up the idea of the 
limit of a sequence. It is the simplest type of limit, and yet a thorough 
comprehension of it enables one to understand other limits rather easily. 

We shall be working in $R^m$, Euclidean space of dimension m. The
first thing we need is some language for describing when points are near 
to one another. 


Definition. If $X_0 \in R^m$ and $r > 0$, the open ball of radius r about the
point $X_0$ is

(2.1) $B(X_0; r) = \{X ; |X - X_0| < r\}.$

The closed ball of radius r about the point $X_0$ is

(2.2) $\bar{B}(X_0; r) = \{X ; |X - X_0| \leq r\}.$

A neighborhood of $X_0$ is a subset (of $R^m$) which contains an open ball
about $X_0$.


A neighborhood of X contains every point which is (sufficiently) 
close to X. The most important neighborhoods of X are the open balls 
about X. Notice that the concept of open [closed] ball reverts to open
[closed] disk in $R^2$ and (symmetric) open [closed] interval in $R^1$.
Definition. The sequence $\{X_n\}$ converges to the point X if every
neighborhood of X contains $X_n$, except for a finite number of values of n. If
$\{X_n\}$ converges to X, then we write

(2.3) $X = \lim X_n.$


Suppose that $X = \lim X_n$. If $r > 0$, the open ball $B(X; r)$ contains
$X_n$, except for certain integers $n_1, \ldots, n_k$, which presumably depend on r.
Let $N_r$ be the maximum of those numbers, and then $N_r$ is a positive integer
such that

$|X - X_n| < r, \quad n > N_r.$

Since every neighborhood of X contains some $B(X; r)$, we see that $X_n$
converges to X if and only if, for each positive number r, there exists a
positive integer $N_r$ such that

(2.4) $|X - X_n| < r, \quad n > N_r.$


The terminology concerning convergence is frequently bent in
various ways. Often, we omit the braces and say “$X_n$ converges to X”. We
say that the sequence $\{X_n\}$ converges (or, is convergent; or, has a limit) if
there exists an X such that $\{X_n\}$ converges to X. If no such X exists, we
say that the sequence diverges. If $\{X_n\}$ converges to X, we sometimes call
X the limit of the sequence and we sometimes write

$X = \lim_n X_n$

or

$X = \lim_{n \to \infty} X_n$

instead of (2.3).

Be careful about trying to bend the wording of the definition of
convergent sequence. There is a distinct difference between “contains $X_n$,
except for a finite number of n’s” and “contains all except a finite number
of $X_n$’s”. In $R^1$, the sequence $x_n = (-1)^n$ would converge to every $x \in R$,
if we used the second wording. Every open interval contains all except a
finite number of $x_n$’s, because there are only two $x_n$’s. On the other
hand, no interval of length less than 2 has the property that it contains $x_n$,
except for a finite number of values of n. Hence, this sequence $\{x_n\}$
does not converge.
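
The criterion (2.4) can be probed numerically. The following Python sketch is an illustration only, under assumptions that are not in the text: the helper name `last_bad_index`, the finite `horizon`, and the sample sequences are hypothetical, and a finite search can suggest convergence but never prove it.

```python
def last_bad_index(term, x, r, horizon=10_000):
    """Largest n <= horizon with |x - term(n)| >= r (0 if there is none).

    If x_n converges to x, this value stabilizes at a usable N_r as the
    horizon grows. If it keeps pace with the horizon, the ball B(x; r)
    misses x_n for infinitely many n, as far as the search can tell.
    """
    last_bad = 0
    for n in range(1, horizon + 1):
        if abs(x - term(n)) >= r:
            last_bad = n
    return last_bad


# x_n = 1/n converges to 0: beyond n = 100 every term is within 0.01 of 0.
n_r = last_bad_index(lambda n: 1.0 / n, 0.0, 0.01)

# x_n = (-1)^n: for the candidate limits 1 and -1 the bad indices never stop.
bad_for_plus_one = last_bad_index(lambda n: (-1.0) ** n, 1.0, 0.5)
bad_for_minus_one = last_bad_index(lambda n: (-1.0) ** n, -1.0, 0.5)
```

For the alternating sequence the last bad index sits at the end of the search window for either candidate limit, which matches the remark that no interval of length less than 2 contains $x_n$ for all but finitely many n.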


Lemma. If

$X = \lim X_n$ and $Y = \lim Y_n$

then

(i) for every real number c,

$cX + Y = \lim (cX_n + Y_n);$

(ii) $\langle X, Y \rangle = \lim \langle X_n, Y_n \rangle.$


Proof. (i) Part (i) is trivial when c = 0. Suppose $c \neq 0$, and let r be a
positive number. Since

$|(cX + Y) - (cX_n + Y_n)| \leq |c||X - X_n| + |Y - Y_n|$

the distance from $cX_n + Y_n$ to $cX + Y$ will be less than r provided

$|Y - Y_n| < \frac{r}{2}, \quad |X - X_n| < \frac{r}{2|c|}.$

Since $\{X_n\}$ converges to X, there is a positive integer M such that

$|X - X_n| < \frac{r}{2|c|}, \quad n > M.$

Similarly, since $\{Y_n\}$ converges to Y, there is a positive integer N such that

$|Y - Y_n| < \frac{r}{2}, \quad n > N.$

Let K = max (M, N). Then we shall have

$|(cX + Y) - (cX_n + Y_n)| < r, \quad n > K.$
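(ii) There is a standard sort of manipulation for this type of argument:

$\langle X, Y \rangle - \langle X_n, Y_n \rangle = \langle X, Y - Y_n \rangle + \langle X - X_n, Y_n \rangle.$

Thus by Cauchy’s inequality

$|\langle X, Y \rangle - \langle X_n, Y_n \rangle| \leq |X||Y - Y_n| + |Y_n||X - X_n|.$

Since $Y = \lim Y_n$, we have

$|Y - Y_n| < 1$

and thus

$|Y_n| < 1 + |Y|$

except for a finite number of n’s. If b is the larger of $|X|$ and $1 + |Y|$, we
have

$|\langle X, Y \rangle - \langle X_n, Y_n \rangle| \leq b(|Y - Y_n| + |X - X_n|)$

except for finitely many n. The remainder of the proof is like the proof of
part (i).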
Three special cases of this lemma should be noticed. If

$X = \lim X_n$
$Y = \lim X_n$

then

$Y - X = \lim (X_n - X_n) = 0.$

Hence, a sequence has at most one limit. In part (ii), if $X_n = Y_n$, we have

$|X|^2 = \lim |X_n|^2.$

It is then easy to show that we can remove the exponents 2; but, note that
it is trivial by another means to see that

$|X| = \lim |X_n|$

if $X_n$ converges to X, because

$\bigl| |X| - |X_n| \bigr| \leq |X - X_n|.$

In $R^1$, part (ii) states that $x_n y_n$ converges to xy if $x_n$ converges to x and $y_n$
converges to y.


Theorem 1. The sequence $\{X_n\}$ in $R^m$:

$X_n = (x_{n1}, \ldots, x_{nm})$

converges if and only if each of the m coordinate sequences $\{x_{nj}\}$ converges.
If

$x_j = \lim_n x_{nj}, \quad 1 \leq j \leq m$

then $X_n$ converges to $X = (x_1, \ldots, x_m)$.


Proof. Suppose that $\{X_n\}$ converges to a vector $X = (x_1, \ldots, x_m)$.
Let j be an index, $1 \leq j \leq m$. Since

$|x_j - x_{nj}| \leq |X - X_n|$

and $\lim |X_n - X| = 0$, the sequence $\{x_{nj}\}$ converges to $x_j$.
Now, suppose that for each j, $1 \leq j \leq m$, the coordinate sequence
$\{x_{nj}\}$ is known to converge. Let $x_j = \lim_n x_{nj}$, and define $X = (x_1, \ldots, x_m)$.
Observe that

$|X - X_n| \leq |x_1 - x_{n1}| + |x_2 - x_{n2}| + \cdots + |x_m - x_{nm}|.$

Each sequence $\{|x_j - x_{nj}|\}$ converges to 0; hence, the previous lemma
tells us that their sum converges to 0 also. We conclude that $\lim |X - X_n| = 0$,
i.e., $X = \lim X_n$.
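
The two inequalities that drive this proof, $|x_j - x_{nj}| \leq |X - X_n| \leq \sum_j |x_j - x_{nj}|$, can be checked on a concrete sequence. The Python sketch below is only an illustration: the sample sequence in $R^3$, the function names, and the rounding tolerance `tol` are assumptions introduced here, not part of the text.

```python
import math

LIMIT = (0.0, 1.0, 1.0)


def term(n):
    """n-th term of a sample sequence in R^3 whose limit is LIMIT."""
    return (1.0 / n, 1.0 + 2.0 ** (-n), n / (n + 1.0))


def coordinate_gaps(n):
    """The numbers |x_j - x_nj| for j = 1, ..., m."""
    return [abs(x - xn) for x, xn in zip(LIMIT, term(n))]


def norm_gap(n):
    """The Euclidean distance |X - X_n|."""
    return math.sqrt(sum(g * g for g in coordinate_gaps(n)))


def inequalities_hold(n, tol=1e-12):
    """Check max_j |x_j - x_nj| <= |X - X_n| <= sum_j |x_j - x_nj| up to rounding."""
    gaps = coordinate_gaps(n)
    return max(gaps) <= norm_gap(n) + tol and norm_gap(n) <= sum(gaps) + tol
```

The left inequality gives the "only if" half (each coordinate gap is squeezed by the norm), and the right inequality gives the "if" half.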


EXAMPLE 1. Convergence of sequences $\{z_n\}$ of complex numbers is
related to Theorem 1. Of course, we say that $z_n$ converges to z if $|z_n - z|$
converges to 0. There is a natural correspondence between C and $R^2$,
whereby z = x + iy corresponds to the point (x, y). Theorem 1 tells us
that $z = \lim z_n$ if and only if Re $(z_n)$ converges to Re (z) and Im $(z_n)$
converges to Im (z). Analogous remarks apply to the space $C^n$.
EXAMPLE 2. We shall refer from time to time to convergence of
sequences of matrices. Suppose we talk about convergence of sequences
of k × k matrices with real [complex] entries. We are then operating in
the space $R^{k \times k}$ [$C^{k \times k}$], Euclidean space with the $k^2$ coordinates arranged
in k rows and k columns. By Theorem 1, that is equivalent to convergence
of the corresponding entries. Note that, if $A_n$ converges to A and $B_n$
converges to B, then the matrix product $A_nB_n$ converges to AB. The proof is
as in part (ii) of the last lemma:

$|AB - A_nB_n| \leq |A||B - B_n| + |A - A_n||B_n|.$

We have used here the complex version of Theorem 5 of Chapter 1:

$|ST| \leq |S||T|.$


EXAMPLE 3. In many problems, we are given a sequence $\{X_n\}$ and we
are interested in the convergence of the successive sums

(2.5) $S_n = X_1 + \cdots + X_n.$

We then speak of the infinite series

(2.6) $\sum_n X_n$

and we call $S_1, S_2, \ldots$ (2.5) the partial sums of the series. We say that the
series converges if the sequence $\{S_n\}$ converges; and if

$S = \lim S_n$

we call S the sum of the series, and we write

(2.7) $S = \sum_n X_n$ or $S = \sum_{n=1}^{\infty} X_n.$

The symbolism (2.6) for the series is a little sloppy, but very convenient.
When we say that $\sum X_n$ “is” an infinite series, we mean that $\{X_n\}$ is a
sequence and what interests us is the question of the convergence of the
partial sums $S_n$. But, if S is a vector and we write (2.7), that means that
the series converges and S is the sum.
In Section 1.3, we associated with each number $x \in [0, 1]$ a decimal
representation

$x \sim .a_1a_2a_3 \ldots.$

In series notation that is now seen to be:

$x = \sum_{n=1}^{\infty} a_n 10^{-n}.$

If z is a complex number and $|z| < 1$, then the series

$\sum_{n=0}^{\infty} z^n$

converges, and its sum is $(1 - z)^{-1}$. For this series,

$S_n = 1 + z + \cdots + z^n = \frac{1 - z^{n+1}}{1 - z}.$

Since $|z| < 1$, $\lim_k z^k = 0$. Thus $1 - z^{n+1}$ converges to 1 and $S_n$
converges to $(1 - z)^{-1}$. Symbolically,

$\frac{1}{1 - z} = \sum_{n=0}^{\infty} z^n, \quad |z| < 1.$
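
As a numerical illustration of the geometric series, the sketch below compares the partial sums $1 + z + \cdots + z^n$ with the closed form $(1 - z^{n+1})/(1 - z)$ and with $1/(1 - z)$. The sample point `z` and the function names are assumptions chosen for the example.

```python
def partial_sum(z, n):
    """S_n = 1 + z + ... + z^n, accumulated term by term."""
    total, power = 0j, 1 + 0j
    for _ in range(n + 1):
        total += power
        power *= z
    return total


def closed_form(z, n):
    """(1 - z^(n+1)) / (1 - z), valid for z != 1."""
    return (1 - z ** (n + 1)) / (1 - z)


z = 0.5 + 0.3j           # sample point with |z| < 1
limit = 1 / (1 - z)      # the sum of the series
errors = [abs(partial_sum(z, n) - limit) for n in (5, 20, 80)]
```

The error $|S_n - (1 - z)^{-1}|$ equals $|z|^{n+1}/|1 - z|$, so it shrinks geometrically as n grows.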


Exercises 


1. Let X be a point of $R^m$. What is the union of all neighborhoods of X? What
is the intersection of all neighborhoods of X? Is that intersection a
neighborhood of X?


2. Let S be a bounded non-empty set of real numbers. Prove that there exists a
sequence of points $x_n \in S$ such that

$\lim x_n = \sup S.$

If $\{x_n\}$ is any sequence of points in S which converges, then

$\inf S \leq \lim x_n \leq \sup S.$
3. True or false? If $\{z_n\}$ is a sequence of complex numbers which converges,
then the sequence $\{z_n^{-1}\}$ converges.


4. True or false? If x, converges to x, then the greatest integer in x, converges 
to the greatest integer in x. 


5. True or false? If $X_n$ converges to X, then the angle between (the vectors) $X_n$
and X converges to 0.


6. If 
Ky Sha ae een ee 
and $x_n$ converges to x, find the smallest N such that

$|x - x_n| < 10^{-2}, \quad n > N.$
7. If $x_n \geq 0$ and $x_n$ converges to x, then $\sqrt{x_n}$ converges to $\sqrt{x}$.


8. Let A be a square matrix. Look at the sequence of its powers. Show that if
$A^n$ converges to B, then AB = B. Give an example where the sequence $\{A^n\}$ does
not converge, yet $|A^n|$ remains bounded.


9. Let z be a complex number. Prove that the sequence

$\frac{z^n}{n!}$

is bounded. From the fact that it is bounded, show that it converges to 0. From
this fact prove that, if $\epsilon > 0$, there is a constant K such that

$\frac{|z|^n}{n!} \leq K\epsilon^n$

for all except a finite number of n’s.


10. Let S be a (linear) subspace of $R^n$. If X is a vector in $R^n$, let P(X) be the
orthogonal projection of X onto the subspace S. Show that if $X_n$ converges to
X, then $P(X_n)$ converges to P(X).

11. Let $\{A_n\}$ be a sequence of invertible k × k matrices. Suppose $A_n$ converges
to A but A is not invertible. Show that

“$\lim_n |A_n^{-1}| = \infty$”.


12. Prove that if $x_n \in R$ and $x_n$ converges to x, then the sequence of arithmetic
means

$\sigma_n = \frac{1}{n}(x_1 + \cdots + x_n)$

also converges to x.


2.2. Convergence Criteria 


It is extremely important to develop criteria which will ensure that a 
sequence converges, in spite of the fact that we do not know the limit 
explicitly. How can we tell if a sequence converges? One crude test is the 
boundedness of the sequence 


$|X_n| \leq M, \quad n = 1, 2, 3, \ldots.$


Every convergent sequence is bounded; hence, boundedness is a necessary 
condition for convergence. But it is a very long way from being sufficient. 
Many bounded sequences fail to converge. The interesting sufficient con- 
ditions are derived from the completeness of the real number system. 


Theorem 2. Let

$x_1 \leq x_2 \leq x_3 \leq \cdots$

be a monotone-increasing sequence of real numbers. The sequence converges
if and only if it is bounded.


Proof. The content here is in the “if” half of the theorem. The
hypothesis is then that

$x_1 \leq x_2 \leq x_3 \leq \cdots \leq b$

where $b \in R$. Thus b is an upper bound for the set of values of the
sequence. Let x be the least upper bound for that set:

$x = \sup \{x_n ; n \in Z_+\}.$

Let $r > 0$. Then x - r is not an upper bound for the set of $x_n$’s. Thus,
there exists a positive integer $N_r$ such that

$x_{N_r} > x - r.$

Since $x_n \leq x_{n+1} \leq x$, we have

$x - r < x_n \leq x, \quad n > N_r.$


Theorem 2 is another in the list of reformulations of the completeness
of the real number system. It is just the sequential form of the existence
of least upper bounds. Evidently, there is a companion result about
monotone decreasing sequences.
Let $x_1, x_2, x_3, \ldots$ be any bounded sequence of real numbers. For
each n, define

$a_n = \inf \{x_k ; k \geq n\}$
$b_n = \sup \{x_k ; k \geq n\}.$

Then we have two monotone sequences, one increasing and the other
decreasing:

(2.8) $a_1 \leq a_2 \leq a_3 \leq \cdots \leq b_3 \leq b_2 \leq b_1.$


The limit inferior and limit superior of the sequence $\{x_n\}$ are then defined
by

(2.9) $\liminf x_n = \lim a_n$
      $\limsup x_n = \lim b_n.$

Obviously,

$\liminf x_n \leq \limsup x_n.$

Equality holds if and only if $\{x_n\}$ converges. If it converges, it converges
to the common value of the lim inf and lim sup.
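
For a feel of how the tail infima $a_n = \inf \{x_k ; k \geq n\}$ and tail suprema $b_n = \sup \{x_k ; k \geq n\}$ behave, here is a small Python sketch on a finite window of a sample sequence. The function name, the window length, and the sequence $x_k = (-1)^k(1 + 1/k)$ are assumptions made for illustration; on a finite window the computed values agree with the true tail bounds only where the discarded terms do not matter.

```python
def tail_inf_sup(values):
    """For a finite list x_1, ..., x_K return lists (a, b) with
    a[i] = min(values[i:]) and b[i] = max(values[i:]).
    """
    a, b = [], []
    lo, hi = float("inf"), float("-inf")
    for v in reversed(values):          # sweep from the end of the window
        lo, hi = min(lo, v), max(hi, v)
        a.append(lo)
        b.append(hi)
    a.reverse()
    b.reverse()
    return a, b


# Sample sequence x_k = (-1)^k (1 + 1/k): lim inf is -1 and lim sup is 1.
xs = [(-1) ** k * (1 + 1 / k) for k in range(1, 2001)]
a, b = tail_inf_sup(xs)
```

The list `a` is increasing, the list `b` is decreasing, and midway through the window they are already close to -1 and 1. Since those limits differ, the sample sequence does not converge.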


Theorem 3. Let $\{x_n\}$ be a bounded sequence of real numbers. The
sequence converges if and only if

$\liminf x_n = \limsup x_n.$

If it converges, it converges to the common value of the limit inferior and
limit superior.


Proof. Define $a_n$ and $b_n$ by (2.8). First, suppose the sequence $\{x_n\}$
converges to the number x. Given $\epsilon > 0$, we can find an N such that

$x - \epsilon < x_n < x + \epsilon, \quad n > N.$

Therefore,

$x - \epsilon \leq a_n \leq x + \epsilon$
$x - \epsilon \leq b_n \leq x + \epsilon, \quad n > N.$

It should now be clear that

$x = \lim a_n = \lim b_n.$

Now, suppose that we are given a sequence $\{x_n\}$ with

$\liminf x_n = \limsup x_n.$

Let $x = \lim a_n = \lim b_n$ and let us show that $x_n$ converges to x. Let $\epsilon > 0$.
There exist positive integers M, N such that

$x - \epsilon < a_n < x + \epsilon, \quad n > M$
$x - \epsilon < b_n < x + \epsilon, \quad n > N.$

From the definitions of $a_n$ and $b_n$, we then have

$x - \epsilon < x_n, \quad n > M$
$x_n < x + \epsilon, \quad n > N.$

Thus,

$|x - x_n| < \epsilon, \quad n > \max (M, N).$


One frequently sees the notations $\overline{\lim}\, x_n$ and $\underline{\lim}\, x_n$ employed for
lim sup $x_n$ and lim inf $x_n$, respectively. In attempting to picture the
geometrical relationship of the superior and inferior limits to the sequence $\{x_n\}$,
the following observation (which we state only for lim sup) is sometimes
useful. If $s = \limsup x_n$ and $\epsilon > 0$, then $x_n > s + \epsilon$ holds for only a
finite number of values of n and $x_n > s - \epsilon$ holds for infinitely many
values of n.

The completeness of the real number system will (evidently) be
reflected in some type of completeness of Euclidean space $R^m$. The
sequential form of that completeness is conveniently phrased in terms of
the Cauchy convergence criterion.

Start with a sequence $\{X_n\}$ in $R^m$. If it converges to some point X,
then the various $X_n$’s with large subscripts must be close to one another,
because they are all close to the limit point X.


Definition. A Cauchy sequence is a sequence $\{X_n\}$ with this property.
For each $\epsilon > 0$, there exists a positive integer $N_\epsilon$ such that

(2.10) $|X_k - X_n| < \epsilon, \quad k > N_\epsilon, \ n > N_\epsilon.$

We have just commented that every convergent sequence is a Cauchy
sequence. In that case, a positive integer $N_\epsilon$ (2.10) can be determined
precisely this way. If $X_n$ converges to X, choose $N_\epsilon$ so that

$|X - X_n| < \frac{\epsilon}{2}, \quad n > N_\epsilon.$

Then (2.10) is satisfied.


Theorem 4 (Completeness of $R^m$). Every Cauchy sequence in $R^m$
converges.


Proof. Suppose

$X_n = (x_{n1}, \ldots, x_{nm}), \quad n = 1, 2, \ldots$

is a sequence of points in $R^m$. For each coordinate index j

$|x_{kj} - x_{nj}| \leq |X_k - X_n|.$

Therefore, if $\{X_n\}$ is a Cauchy sequence, each of the m sequences $\{x_{nj}\}$ is
a Cauchy sequence in R. If we show that each of those sequences
converges, we will know that the sequence $\{X_n\}$ converges (Theorem 1). In
other words, we need only prove the theorem in $R^1$.
Let $\{x_n\}$ be a Cauchy sequence of real numbers. How will we find a
number $x \in R$ to which it converges? First, let us note that the sequence
is bounded. By the Cauchy condition, there is a positive integer N such
that

$|x_k - x_n| < 1, \quad k \geq N, \ n \geq N.$

In particular,

$|x_N - x_n| < 1, \quad n \geq N.$

Let M be the largest of the numbers

$|x_1|, \ldots, |x_{N-1}|, 1 + |x_N|$

and plainly $|x_n| \leq M$ for all n.
Now, let

$x = \liminf x_n.$

We remind the reader what that means:

$a_n = \inf \{x_k ; k \geq n\}$
$a_1 \leq a_2 \leq a_3 \leq \cdots$
$x = \lim a_n.$

We have just used the completeness of the real numbers, in the form of
Theorem 2, to tell us that x exists. We claim that $x_n$ converges to x.
Let $\epsilon > 0$. We wish to show that $x_n$ is in the open interval
$(x - \epsilon, x + \epsilon)$, except for a finite number of n’s. Why is any $x_n$ in that interval?
Look at the definition of x. We can find N so that

(2.11) $x - \epsilon < a_N \leq x.$

Now, look at the definition of $a_N$. Together with (2.11), it tells us two
things:

(i) $x_n > x - \epsilon$ for all $n \geq N$.
(ii) There exists some $k \geq N$ such that $x_k < x + (\epsilon/2)$.

(In (ii), we could replace $x + (\epsilon/2)$ by any number greater than x.)
Now apply the Cauchy condition. There is a positive integer P such
that

(2.12) $|x_k - x_n| < \frac{\epsilon}{2}, \quad k > P, \ n > P.$

We may assume that the N in (2.11) is greater than P, because, if (2.11)
holds for a particular N, it holds for every larger N. If N > P, then (i),
(ii), and (2.12) tell us that

$x - \epsilon < x_n < x + \epsilon, \quad n \geq N.$


EXAMPLE 4. One of the most useful special cases of Theorem 4 is the
following. Suppose $\{X_n\}$ is a sequence and

(2.13) $|X_n - X_{n+1}| \leq 2^{-n}, \quad n = 1, 2, 3, \ldots.$

Then $\{X_n\}$ is a Cauchy sequence. Why? Suppose k < n. Then

$|X_k - X_n| \leq |X_k - X_{k+1}| + |X_{k+1} - X_{k+2}| + \cdots + |X_{n-1} - X_n|$
$\leq 2^{-k} + 2^{-(k+1)} + \cdots + 2^{-(n-1)}$
$= 2(2^{-k} - 2^{-n})$
$< 2^{-(k-1)}.$

Consequently, any sequence in $R^m$ which satisfies (2.13) is convergent.
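
The estimate $|X_k - X_n| < 2^{-(k-1)}$ for k < n can be tested on a sequence whose consecutive steps have length exactly $2^{-n}$. The Python sketch below uses a hypothetical spiral in $R^2$; the sequence and the function names are assumptions made for illustration.

```python
import math


def point(n):
    """X_n = sum_{j=1}^{n-1} 2^(-j) (cos j, sin j), so |X_n - X_{n+1}| = 2^(-n)."""
    x = sum(2.0 ** (-j) * math.cos(j) for j in range(1, n))
    y = sum(2.0 ** (-j) * math.sin(j) for j in range(1, n))
    return (x, y)


def dist(p, q):
    """Euclidean distance between two points of R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def cauchy_bound_holds(k, n):
    """Check |X_k - X_n| < 2^(-(k-1)) for k < n."""
    return dist(point(k), point(n)) < 2.0 ** (-(k - 1))
```

Because the bound depends only on the smaller index k, it controls all later terms at once, which is exactly the Cauchy property.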


EXAMPLE 5. The Cauchy condition on a particular sequence frequently
arises in this way. In addition to the sequence of points $X_n$, we have a
sequence of sets $S_1, S_2, S_3, \ldots$ with these properties:

(i) $S_1 \supset S_2 \supset S_3 \supset \cdots$;
(ii) $X_n \in S_n$;
(iii) $\lim \operatorname{diam} (S_n) = 0$.

Here diam (S) is the diameter of the set S:

$\operatorname{diam} (S) = \sup \{|X - Y| ; X \in S, Y \in S\}.$

This is finite if the set S is bounded, i.e., if S is contained in some ball
about the origin. Clearly (i), (ii), and (iii) imply that we have a Cauchy
sequence:

$|X_k - X_n| \leq \operatorname{diam} (S_N), \quad k \geq N, \ n \geq N.$

In fact, the existence of such a sequence of sets is just a reformulation of
the Cauchy property.


Exercises 


1. True or false? If $|X_1| \geq |X_2| \geq \cdots$, then the sequence $\{X_n\}$ converges.

2. Give an example of a sequence for which

$\lim |X_n - X_{n+1}| = 0,$

but which is not a Cauchy sequence.


3. Describe a Cauchy sequence of rational numbers which does not converge
in the rational number system.


4. If z is a complex number, what is

$\liminf |z^n|$?

5. True or false? If the sequence $\{X_n\}$ converges, then the set of norms

$\{|X_n| ; n \in Z_+\}$

has a largest member.

6. True or false? If the infinite series $\sum X_n$ converges, then the set of norms

$\{|X_n| ; n \in Z_+\}$

has a largest number in it.


7. Let $\{x_n\}$ be a bounded sequence of real numbers. Let A be the set of real
numbers t such that $x_n < t$ for only a finite number of n’s. Show that

$\liminf x_n = \sup A.$

8. True or false? If z is a complex number and $|z| > 1$, then $\sum z^n$ diverges.

9. True or false? If A is a k × k matrix and $|A| > 1$, then $\sum A^n$ diverges.

10. Let

$x_n = (1 - 2^{-1})(1 - 2^{-2}) \cdots (1 - 2^{-n}).$

Prove that the sequence converges and $\lim x_n \neq 0$.

11. Let $\{x_n\}$ and $\{y_n\}$ be (bounded) sequences of real numbers. Prove that

$\limsup (x_n + y_n) \leq \limsup x_n + \limsup y_n.$


12. Prove this generalization of Example 4. If $\{X_n\}$ is a sequence in $R^m$ such that

$\sum_n |X_n - X_{n+1}| < \infty$

then $\{X_n\}$ is a Cauchy sequence.


13. True or false? In R!, every convergent sequence is the sum of an increasing 
sequence and a decreasing sequence. 


14. Assume the monotone convergence theorem. Prove that every non-empty 
set of real numbers which is bounded above has a least upper bound. 


15. If you knew that every Cauchy sequence in R converged, how would you 
prove the monotone convergence theorem? (You will need to use the Archi- 
medean ordering property.) 


2.3. Infinite Series 


Now we shall see what the convergence criterion of the last section 
tells us about infinite series. We shall concentrate on the two most impor- 
tant classes of infinite series, namely, series with positive terms and ab- 
solutely convergent series. 

We shall make frequent use of the series analogue of sums and scalar
multiples of convergent sequences: If $\sum X_n = S$ and $\sum Y_n = T$, then
$\sum (cX_n + Y_n) = cS + T$.

Suppose that we have an infinite series in which the terms are
non-negative real numbers:

$\sum_n x_n, \quad x_n \geq 0.$

Then the partial sums constitute an increasing sequence:

$s_n = \sum_{k=1}^{n} x_k$
$s_1 \leq s_2 \leq s_3 \leq \cdots.$

If that sequence is bounded, the monotone convergence theorem tells
us that it converges and hence (by definition) the infinite series converges.
If the sequence $\{s_n\}$ is not bounded, it does not converge and so the infinite
series diverges.

The two possibilities for a series with non-negative terms are
conveniently described by the notation

$\sum x_n < \infty$ (convergence)

$\sum x_n = \infty$ (divergence).


EXAMPLE 6. Two simple examples with which the reader should be
familiar are

$\sum_n \frac{1}{n}, \quad \sum_n \frac{1}{n^2}.$

The first series diverges, because the grouping

(2.14) $1 + \frac{1}{2} + (\frac{1}{3} + \frac{1}{4}) + (\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}) + \cdots$

shows that the $2^n$th partial sum exceeds $(n + 1)\frac{1}{2}$. There are several ways
to show that the second series converges. One way is to verify by
mathematical induction that

(2.15) $1 + \frac{1}{4} + \frac{1}{9} + \cdots + \frac{1}{n^2} \leq 2 - \frac{1}{n}.$

It then follows that

$\sum_n \frac{1}{n^2} \leq 2.$

Notice that, once we have verified (2.15), the monotone convergence
theorem guarantees that the series converges, but it does not tell us what
the sum of the series is. As a matter of fact,

$\sum_n \frac{1}{n^2} = \frac{\pi^2}{6}$

but that is hardly obvious.
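
Both claims of this example can be checked with exact rational arithmetic. The sketch below reads the grouping estimate as "the partial sum with $2^n$ terms exceeds (n + 1)/2" and reads (2.15) as $1 + \frac{1}{4} + \cdots + \frac{1}{n^2} \leq 2 - \frac{1}{n}$; those readings, the function names, and the tested ranges are assumptions made for illustration. A finite check is of course no substitute for the induction.

```python
from fractions import Fraction


def harmonic(n):
    """Exact partial sum 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def inverse_squares(n):
    """Exact partial sum 1 + 1/4 + ... + 1/n^2."""
    return sum(Fraction(1, k * k) for k in range(1, n + 1))


# Grouping (2.14): the partial sum with 2^n terms exceeds (n + 1)/2.
grouping_ok = all(harmonic(2 ** n) > Fraction(n + 1, 2) for n in range(0, 11))

# Inequality (2.15): 1 + 1/4 + ... + 1/n^2 <= 2 - 1/n.
bound_ok = all(inverse_squares(n) <= 2 - Fraction(1, n) for n in range(1, 201))
```

The first check shows the harmonic partial sums growing without bound (slowly), while the second keeps the partial sums of the inverse squares below 2.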


Given the non-negative series $\sum x_n$, it is not always easy to
determine whether or not the partial sums are bounded. Often, a certain amount
of cleverness or experience is needed. A beginner would probably have to
fiddle with the harmonic series $\sum 1/n$ for quite a while to guess whether
or not it converged; and, even if he guessed that it diverged, it might be
another while before he thought of proving it by grouping the terms as
in (2.14). We showed that

$\sum_n \frac{1}{n^2} < \infty$

by verifying the inequality (2.15), which is probably not the first thing one
would think of, upon being confronted with that infinite series.
As one builds a stockpile of specific series which are known to
converge or diverge, it becomes possible to test new series for convergence by
comparing them with the known series. If

$\sum y_n < \infty$

and $0 \leq x_n \leq y_n$, obviously

$\sum x_n < \infty.$

In fact, it is enough to know that $x_n \leq y_n$ for all sufficiently large n:

$x_n \leq y_n, \quad n \geq N.$

Similarly, if

$\sum y_n = \infty$

and $x_n \geq y_n \geq 0$ for all sufficiently large n, then $\sum x_n$ diverges.


EXAMPLE 7. In Section 2.1, we verified that the series $\sum z^n$ converges
for all complex numbers of absolute value less than 1:

$\frac{1}{1 - z} = \sum_{n=0}^{\infty} z^n, \quad |z| < 1.$

In particular,

(2.16) $\sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}, \quad 0 \leq x < 1.$

A number of different series can be seen to converge by comparing
their terms with the terms of a geometrical series (2.16). For instance,

(2.17) $\sum_n \frac{n^2}{2^n} < \infty$

because

(2.18) $\frac{n^2}{2^n} \leq (\frac{2}{3})^n$, for large n.

Why? The inequality (2.18) says that

$(\frac{3}{4})^n n^2 \leq 1$, for large n

which is true, because

$\lim_n n^2 t^n = 0$

for any fixed t such that 0 < t < 1. This last assertion can be proved as
follows. Since

$\lim_n (\frac{n}{n + 1})^2 = 1$

and t < 1, there exists an N such that

$t < (\frac{n}{n + 1})^2, \quad n > N$

$(n + 1)^2 t^{n+1} < n^2 t^n, \quad n > N.$

Thus the sequence $\{n^2 t^n\}$ is monotone decreasing for large n and,
accordingly, it converges. If the limit were not 0, we could divide and obtain
the contradiction

$1 = \frac{\lim n^2 t^n}{\lim (n + 1)^2 t^{n+1}} = \lim (\frac{n}{n + 1})^2 \frac{1}{t} = \frac{1}{t}.$

The reader should be aware that the series

$\sum_n \frac{x^n}{n!}$

converges for every real number x. For the moment, let’s worry only about
non-negative numbers x. If $0 \leq x \leq 1$, the series obviously converges.
The interesting point is that it converges for large x. Fix such an x. What
happens to

$\frac{x^n}{n!}$

as n gets large? We pass from the nth term to the (n + 1)th term by
multiplying by x/(n + 1). Once n is large enough so that n + 1 > x, we are
multiplying by a number less than 1. It should be clear now that the series 
converges, and converges faster than a geometric series. If we want to be 
precise, we can phrase it this way. Given x, choose a positive integer N 
such that 


Then 
The series which we have just been discussing is the power series for the
exponential function and we shall say more about it shortly. The reader
should be familiar with the case x = 1 which defines the number e:

$e = \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots.$
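
The remark that each term of $\sum x^n/n!$ comes from the previous one through multiplication by x/(n + 1) translates directly into a way of accumulating partial sums. The Python sketch below is an illustration only; the function name `exp_series` and the chosen numbers of terms are assumptions, and `math.exp` is used purely as a reference value.

```python
import math


def exp_series(x, terms):
    """Partial sum of sum_{n >= 0} x^n / n!, using the first `terms` terms.

    Each term is obtained from the previous one by multiplying by x/(n + 1).
    """
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)
    return total


approx_e = exp_series(1.0, 30)          # the case x = 1
approx_e10 = exp_series(10.0, 100)      # a "large" x needs more terms
```

For x = 10 the terms grow until n passes x and only then decay, so more terms are needed before the partial sums settle.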


Now that we have indicated what the monotone convergence theorem
tells us about non-negative series, let us see what the Cauchy convergence
criterion tells us about infinite series. Suppose that we have a (now
vector-valued) series with terms $X_n$ and partial sums $S_n$:

$S_n = \sum_{k=1}^{n} X_k.$

By definition, the series converges if the sequence $\{S_n\}$ converges.
According to Theorem 4, the series converges if and only if $\{S_n\}$ is a Cauchy
sequence. The crudest possible estimate for the distance from $S_k$ to $S_n$
(with n > k) is

$|S_n - S_k| = |X_{k+1} + \cdots + X_n|$
$\leq |X_{k+1}| + \cdots + |X_n|.$

How can we guarantee that the last sum is small, provided that k and n
are both large? Well

$|X_{k+1}| + \cdots + |X_n| < \epsilon, \quad N \leq k < n$

means that

(2.19) $\sum_{n=N+1}^{\infty} |X_n| \leq \epsilon.$

What does it mean to say that, for each $\epsilon > 0$, there exists an N so that
(2.19) holds? It means that

(2.20) $\sum_n |X_n| < \infty$

i.e., that the series of numbers (2.20) converges. An infinite series such that
(2.20) holds is said to converge absolutely. What we just observed was this.
If an infinite series (in $R^m$) converges absolutely, then it converges. The
converse is by no means true. Many series converge without converging
absolutely. In a sense, absolutely convergent series are the ones which
“really” converge, because for such series we can claim that the sum of
the series is just the sum of all the vectors $X_n$. We may commute and
associate the vectors in any way before we sum. Let us see why.

Suppose that we have an absolutely convergent series $\sum X_n$. The sum
S satisfies

$|S - \sum_{k=1}^{n} X_k| \leq \sum_{k=n+1}^{\infty} |X_k|.$

In fact, if A is any subset of the positive integers, then

(2.21) $|S - \sum_{k \in A} X_k| \leq \sum_{k \notin A} |X_k|.$




It is this fact which makes all rearrangements possible. Let $A_1, A_2, \ldots$
be a sequence of subsets of the positive integers such that each n is in
precisely one $A_k$:

$Z_+ = \bigcup_n A_n$
$A_n \cap A_k = \emptyset, \quad n \neq k.$

For each n, the series

$\sum_{k \in A_n} X_k$

converges, because it converges absolutely:

$\sum_{k \in A_n} |X_k| \leq \sum_k |X_k| < \infty,$

and we assert that

$\sum_k X_k = \sum_n (\sum_{k \in A_n} X_k).$

In other words, if

$Y_n = \sum_{k \in A_n} X_k$

then the series $\sum Y_n$ converges and

$\sum_n Y_n = \sum_k X_k.$

That conclusion can be obtained directly from (2.21):

$|S - Y_1| \leq \sum_{k \notin A_1} |X_k|$
$|S - (Y_1 + Y_2)| \leq \sum_{k \notin (A_1 \cup A_2)} |X_k|$, etc.


EXAMPLE 8. If a series converges but is not absolutely convergent, one
must be very careful about rearranging the order of the terms or grouping
them in various ways in order to find the sum of the series. For instance,
the alternating harmonic series

$\sum_n (-1)^n \frac{1}{n}$

converges (see Exercise 4); however, we cannot compute the sum as

$\sum_{n \text{ even}} \frac{1}{n} - \sum_{n \text{ odd}} \frac{1}{n}$

because neither of those series converges. A well-known fact, which we
shall not stop to prove, is this. If $\sum x_n$ is an infinite series of real numbers
which converges but does not converge absolutely, then not only are
there rearrangements of the order of the terms which yield divergent
series; in fact, if s is any real number, the order of the terms can be so
rearranged as to yield a series which converges to s. Evidently, unless a
series is absolutely convergent, one should not try to think of the sum of
the series as being “the sum of all the terms”.
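
The rearrangement phenomenon is easy to observe numerically. The Python sketch below applies one standard greedy scheme (add positive terms while the running sum is at or below the target, otherwise add negative terms) to the alternating harmonic series $\sum (-1)^n/n$. The function name, the targets, and the number of steps are assumptions chosen for illustration, and a finite run only suggests the limit.

```python
def rearranged_partial_sum(target, steps):
    """Partial sum of a greedy rearrangement of sum (-1)^n / n.

    The positive terms 1/2, 1/4, ... and the negative terms -1, -1/3, ...
    are each used in their original order: the next positive term is added
    while the running sum is at or below the target, otherwise the next
    negative term is added.
    """
    next_even, next_odd = 2, 1
    total = 0.0
    for _ in range(steps):
        if total <= target:
            total += 1.0 / next_even
            next_even += 2
        else:
            total -= 1.0 / next_odd
            next_odd += 2
    return total


near_03 = rearranged_partial_sum(0.3, 100_000)
near_minus_2 = rearranged_partial_sum(-2.0, 100_000)
```

The same terms, taken in two different orders, produce partial sums that settle near 0.3 and near -2 respectively. The scheme works because the positive terms alone and the negative terms alone both have unbounded partial sums, while the individual terms tend to 0.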
EXAMPLE 9. Let A be a k × k matrix with complex entries. Suppose
$|A| < 1$. Then the series

$\sum_{n=0}^{\infty} A^n, \quad (A^0 = I)$

converges absolutely because $|A^n| \leq |A|^n$, $n \geq 0$. It should converge to

$\frac{1}{I - A}$

but that doesn’t exactly make sense. What does make sense is the inverse
of the matrix I - A. Let

(2.22) $B = \sum_{n=0}^{\infty} A^n, \quad |A| < 1.$

It is easy to verify that

(2.23) $B(I - A) = (I - A)B = I$

i.e., I - A is invertible and $B = (I - A)^{-1}$.
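
The identity (2.23), B(I - A) = (I - A)B = I for $B = \sum_{n \geq 0} A^n$, can be checked on a small example. The Python sketch below uses plain lists of rows and a truncated series; the sample matrix, the helper names, and the truncation at 100 terms are assumptions made for illustration, and the matrix norm is taken to be the Euclidean norm of the entries.

```python
def matmul(p, q):
    """Product of two square matrices stored as lists of rows."""
    k = len(p)
    return [[sum(p[i][l] * q[l][j] for l in range(k)) for j in range(k)]
            for i in range(k)]


def identity(k):
    """The k x k identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]


def neumann_sum(a, terms):
    """Partial sum I + A + ... + A^(terms - 1) of the series for B."""
    k = len(a)
    total = [[0.0] * k for _ in range(k)]
    power = identity(k)
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(k)] for i in range(k)]
        power = matmul(power, a)
    return total


def max_gap(p, q):
    """Largest absolute difference between corresponding entries."""
    return max(abs(x - y) for rp, rq in zip(p, q) for x, y in zip(rp, rq))


A = [[0.2, 0.1], [0.0, 0.3]]    # sample matrix; squared entries sum to 0.14 < 1
B = neumann_sum(A, 100)
I2 = identity(2)
I_minus_A = [[I2[i][j] - A[i][j] for j in range(2)] for i in range(2)]
left_residual = max_gap(matmul(B, I_minus_A), I2)
right_residual = max_gap(matmul(I_minus_A, B), I2)
```

Both residuals are at rounding level, consistent with B being a two-sided inverse of I - A.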
Let A be any k × k matrix with complex entries. We define the
exponential of A to be

(2.24) $\exp (A) = e^A = I + A + \frac{1}{2!}A^2 + \cdots = \sum_{n=0}^{\infty} \frac{1}{n!}A^n.$

Again, the series converges absolutely because (see Example 7)

$\sum_n |\frac{1}{n!}A^n| \leq \sum_n \frac{1}{n!}|A|^n = e^{|A|}.$

If B is a matrix which commutes with A, AB = BA, then we can show
that

$e^{A+B} = e^A e^B = e^B e^A.$


For any fixed matrix M 


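Thus,

$e^A e^B = \sum_{n=0}^{\infty} \frac{1}{n!}A^n e^B = \sum_{n=0}^{\infty} \frac{1}{n!}A^n (\sum_{k=0}^{\infty} \frac{1}{k!}B^k).$

Since

$\sum_n \sum_k \frac{1}{n!}\frac{1}{k!}|A|^n|B|^k < \infty$

we may regroup to obtain

$e^A e^B = \sum_{N=0}^{\infty} \sum_{k+n=N} \frac{1}{n!\,k!}A^n B^k.$

Since AB = BA, that says

$e^A e^B = \sum_{N=0}^{\infty} \frac{1}{N!}(A + B)^N = e^{A+B}.$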
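
The role of the hypothesis AB = BA can be seen in a small computation. The Python sketch below approximates the matrix exponential by a truncated power series and compares $e^{A+B}$ with $e^A e^B$ for one commuting pair and one non-commuting pair. The sample matrices, helper names, and truncation at 40 terms are assumptions made for illustration.

```python
def matmul(p, q):
    """Product of two square matrices stored as lists of rows."""
    k = len(p)
    return [[sum(p[i][l] * q[l][j] for l in range(k)) for j in range(k)]
            for i in range(k)]


def add(p, q):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(rp, rq)] for rp, rq in zip(p, q)]


def mat_exp(a, terms=40):
    """Partial sum of the exponential series: sum of A^n / n! for n < terms."""
    k = len(a)
    total = [[0.0] * k for _ in range(k)]
    term = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for n in range(terms):
        total = add(total, term)
        term = [[v / (n + 1) for v in row] for row in matmul(term, a)]
    return total


def max_gap(p, q):
    """Largest absolute difference between corresponding entries."""
    return max(abs(x - y) for rp, rq in zip(p, q) for x, y in zip(rp, rq))


A = [[0.0, 1.0], [0.0, 0.0]]
B = [[1.0, 2.0], [0.0, 1.0]]    # B = I + 2A, so AB = BA
C = [[0.0, 0.0], [1.0, 0.0]]    # AC differs from CA

commuting_gap = max_gap(mat_exp(add(A, B)), matmul(mat_exp(A), mat_exp(B)))
noncommuting_gap = max_gap(mat_exp(add(A, C)), matmul(mat_exp(A), mat_exp(C)))
```

For the commuting pair the two sides agree to rounding error. For the non-commuting pair they differ visibly, which shows that the regrouping step really uses AB = BA.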
nN! 


Exercises 


1. Let x and t be any positive real numbers. Show that

$\frac{x^n}{n!} < t^n$

for all sufficiently large n.

2. True or false? If 0 < t < 1, then

$\lim_n n!\,t^n = \infty.$

3. In view of Exercises 1 and 2, what can you say about the behavior of the
nth root of n! as n gets large?


4. Let $\{x_n\}$ be a sequence of non-negative real numbers which converges
monotonely to 0. Prove that the infinite series

$\sum_n (-1)^n x_n$

converges.


5. Let $\sum X_n$ be an absolutely convergent series of vectors in $R^m$, and let $\{Y_n\}$
be a bounded sequence of vectors in $R^m$. Show that the series of numbers

$\sum_n \langle X_n, Y_n \rangle$

is absolutely convergent.

6. Prove the inequality (2.21), meaning: If $A \subset Z_+$, then $\sum_{k \in A} X_k$ converges and
(2.21) holds.

7. Let $\{X_n\}$ be a sequence of vectors in $R^m$ such that

$\sum_n |X_n - X_{n+1}| < \infty.$

Prove that the sequence $\{X_n\}$ converges. (Such a sequence is called a fast Cauchy
sequence.)


8. By comparison with a geometric series, prove the following about a
(bounded) sequence of complex numbers $\{x_n\}$.
(a) (Ratio test) Let

$s = \limsup |\frac{x_{n+1}}{x_n}|.$

The series $\sum x_n$ converges absolutely if s < 1.
(b) (Root test) Let

$t = \limsup \sqrt[n]{|x_n|}.$

The series $\sum x_n$ converges absolutely if t < 1 and diverges if t > 1.
9. Show that

$\sum_n n^{-(3/2)} < \infty$

by using induction to verify that

$s_n \leq 3 - 2n^{-(1/2)}.$


10. Let $\{x_n\}$ be a monotone decreasing sequence of positive numbers. Prove that
$\sum x_n$ converges if and only if

$\sum_k 2^k x_{2^k}$

converges.


11. Use the result of Exercise 10 to investigate the convergence of

$\sum_n n^{-q}, \quad q > 0.$


12. Let $x_1, x_2, \ldots$ be any sequence of non-negative numbers such that

$\sum_n x_n < \infty.$

Prove that there exist positive integers $N_1, N_2, \ldots$ with these properties:

(i) $N_1 \leq N_2 \leq N_3 \leq \cdots$;
(ii) $\lim_k N_k = \infty$, that is, $\{N_k\}$ is unbounded;
(iii) $\sum_k N_k x_k < \infty$.
13. Prove that

$e^x \geq 1 + x$

for all real numbers x. One procedure might be to verify in order:

$e^x \geq 1 + x, \quad x \geq 0$
$e^x \leq \frac{1}{1 - x}, \quad 0 \leq x < 1$
$e^x \geq 1 + x, \quad x > -1$
$e^x \geq 1 + x, \quad x \in R.$


14. Use the result of Exercise 13 to show that 


exp (215) <11 + 21 <exp (2) 


for all complex numbers z with |z| < 1. 
15. If A is a k × k matrix, show that the matrix $e^A$ is invertible.


16. Let A and B be k × k matrices of norm less than 1. Discover a simple
relationship between $(I - AB)^{-1}$ and $(I - BA)^{-1}$. In what sense does that
relationship hold for all A and B?


17. Let $\{w_n\}$ be a sequence of complex numbers. The infinite product

$\prod_n w_n$

is said to converge if the sequence of partial products

$p_n = \prod_{k=1}^{n} w_k = w_1 w_2 \cdots w_n$

converges to a non-zero number w. Prove the following, by using the results of
Exercises 7 and 13: If $\{z_n\}$ is a sequence of complex numbers ($z_n \neq -1$) such that

$\sum_n |z_n| < \infty$

then the infinite product

$\prod_n (1 + z_n)$

converges.

18. Let $\sum x_n$ be a series of real numbers which converges but does not converge
absolutely. Show that, if t is a real number, the order of the terms in the series
can be rearranged so that the rearranged series converges to t. Hint: Take enough
positive terms so that their sum just exceeds t, then add enough negative terms
so the sum is just under t, etc.


*19. Let $\sum X_n$ be a series of vectors in $R^m$ which converges but does not
converge absolutely. Let S be the set of all vectors which are sums of rearrangements
of the series. What kind of a set can S be?

*20. Prove that

$\det e^A = e^{\operatorname{tr} (A)}$

for all k × k complex matrices A (det = determinant function; tr = trace
function).


2.4. Sequential Compactness 


The sequence $\{X_n\}$ converges to the point X if $X_n$ is near X for all
sufficiently large n. There are many situations in which we are given a
sequence and we don’t need to know that it converges. What we need to
know is that there is a point X at which the sequence accumulates, i.e.,
$X_n$ is near X for infinitely many values of n.


Definition. The point X is a point of accumulation (accumulation
point) of the sequence $\{X_n\}$ if every neighborhood of X contains $X_n$ for
infinitely many values of n.


We can say it another way: X is an accumulation point of $\{X_n\}$ if,
for each $\epsilon > 0$ and each positive integer n, there exists k > n such that

$|X - X_k| < \epsilon.$

If $\{X_n\}$ converges to X, then clearly X is the unique point of accumulation
of the sequence.
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EXAMPLE 10. Let r,,7r,,73,... be the sequence which consists of the 
positive rational numbers, enumerated according to the scheme 


Then, every non-negative real number is an accumulation point of the 
sequence {r,}. 


EXAMPLE 11. Beware of working with coordinates when discussing
accumulation points. Consider in $R^2$

$$X_n = (0, 1), \quad n \text{ odd}$$
$$X_n = (1, 0), \quad n \text{ even}.$$

The sequence of first coordinates is 0, 1, 0, 1, ... which has two accumulation
points in $R$, 0 and 1. The sequence of second coordinates is 1, 0, 1, 0,
... and it has the same accumulation points. In particular, 0 is an accumulation
point for the first coordinates and for the second coordinates. We
cannot conclude that (0, 0) is a point of accumulation of the sequence in
$R^2$.
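
A quick numerical check of Example 11 may help. The sketch below counts how often the terms, or their coordinates, come within 1/2 of a candidate point; the helper names `X` and `hits` and the cutoff of 1000 terms are illustrative assumptions.

```python
def X(n):
    """Term X_n of Example 11: (0, 1) for odd n, (1, 0) for even n."""
    return (0.0, 1.0) if n % 2 == 1 else (1.0, 0.0)

def hits(point, eps, n_max):
    """Count indices n <= n_max with |X_n - point| < eps (Euclidean norm)."""
    px, py = point
    return sum(
        1 for n in range(1, n_max + 1)
        if ((X(n)[0] - px) ** 2 + (X(n)[1] - py) ** 2) ** 0.5 < eps
    )

# Each coordinate sequence comes back to 0 again and again ...
first_near_0 = sum(1 for n in range(1, 1001) if abs(X(n)[0]) < 0.5)
second_near_0 = sum(1 for n in range(1, 1001) if abs(X(n)[1]) < 0.5)
print(first_near_0, second_near_0)

# ... but never in the same term, so no X_n lies near (0, 0).
print(hits((0.0, 0.0), 0.5, 1000))
print(hits((0.0, 1.0), 0.5, 1000))
```

The first coordinate is near 0 only for odd $n$ and the second only for even $n$, which is why the two coordinatewise accumulation points cannot be combined.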

Definition. The sequence $Y_1, Y_2, Y_3, \ldots$ is a subsequence of the
sequence $X_1, X_2, X_3, \ldots$ if there exist positive integers $n_1, n_2, n_3, \ldots$ such
that

(i) $n_1 < n_2 < n_3 < \cdots$;

(ii) $Y_k = X_{n_k}$.


In other words, a subsequence of $\{X_n\}$ is any sequence $X_{n_1}, X_{n_2}, \ldots$
with $n_1 < n_2 < \cdots$. We shall usually describe this by saying that $\{X_{n_k}\}$ is
a subsequence of $\{X_n\}$. The point of introducing subsequences at this stage
could hardly be missed.


Lemma. The point $X$ is a point of accumulation of the sequence $\{X_n\}$
if and only if some subsequence of $\{X_n\}$ converges to $X$.
Proof. We merely remark that, if $X$ is an accumulation point, then

$$|X - X_{n_1}| < 1$$

for some $n_1$. We have

$$|X - X_{n_2}| < \tfrac{1}{2}$$

for infinitely many $n_2$’s. Choose one so that $n_2 > n_1$. Continue in that way
and obtain

$$|X - X_{n_k}| < \frac{1}{k}, \qquad n_1 < n_2 < n_3 < \cdots$$


The completeness of the real number system guarantees that bounded
sequences in $R^m$ have accumulation points. A sequence can wander aimlessly;
however, if it stays in a bounded part of $R^m$, it must accumulate
somewhere. This property is usually called the “sequential compactness”
of bounded parts of $R^m$.


Theorem 5 (Bolzano-Weierstrass). Every bounded sequence in $R^m$ has
a point of accumulation. Equivalently, every bounded sequence in $R^m$ has a
convergent subsequence.


Proof. We shall work with the coordinates, and, as we noted in
Example 11, we must exercise some care. Let the sequence be

$$X_n = (x_{n1}, \ldots, x_{nm})$$

as usual. Since $|x_{nj}| \le |X_n|$, each of the $m$ coordinate sequences is bounded.
Suppose we have proved the theorem for bounded sequences in $R^1$. The
proof for $R^m$ could then be given this way. Pick a subsequence $X_{n_1}, X_{n_2}, \ldots$
for which the first coordinates converge in $R^1$. That subsequence is
bounded. Hence, it has a subsequence for which the second coordinates
converge. This new sequence has the property that the first coordinates
of the vectors converge and the second coordinates of the vectors converge.
In a finite number of steps we shall arrive at a subsequence for
which each of the coordinate sequences converges.

So, our problem is to prove the theorem for a bounded sequence
$\{x_n\}$ in $R^1$. That is now easy, because the limit inferior of the sequence is a
point of accumulation of the sequence. If we don’t wish to refer to that
concept, we can define directly

$$x = \sup \{t;\ x_n > t \text{ for infinitely many } n\}$$

and verify that $x$ is an accumulation point of the sequence.


Corollary. A bounded sequence in $R^m$ converges if and only if it has
precisely one point of accumulation.


Let us outline another proof of the Bolzano-Weierstrass theorem
which has a slightly different intuitive basis. We can take the bounded
sequence, multiply it by a non-zero scalar and then translate it so that
the new sequence is in the box

$$B = \{X;\ 0 \le x_j \le 1,\ j = 1, \ldots, m\}.$$

Therefore, we may as well assume that the given sequence is in $B$. This
box is the Cartesian product of the closed unit interval $I$ with itself $m$ times:

$$B = I \times \cdots \times I = \{X;\ x_j \in I,\ 1 \le j \le m\}.$$

Now cut $I$ in half on each coordinate axis, and see how that subdivides $B$.
It exhibits $B$ as the union of $2^m$ boxes, each of which is the Cartesian product

$$I_1 \times I_2 \times \cdots \times I_m$$

of $m$ closed intervals. Each $I_j$ is either $[0, \tfrac12]$ or $[\tfrac12, 1]$. These $2^m$ boxes
overlap a little, but that is irrelevant. What matters is this. One of those
boxes must contain $X_n$ for infinitely many values of $n$. Pick one such box
and call it $B_1$. Now subdivide $B_1$ into $2^m$ boxes, as we did with $B$, and let
$B_2$ be one which contains $X_n$ for infinitely many values of $n$. (See Figure 4.)
If we continue this subdivision process, we obtain a nested sequence of
boxes

$$B_1 \supset B_2 \supset B_3 \supset \cdots$$

each of which contains $X_n$ for infinitely many $n$; and the edges of $B_k$ have
length $2^{-k}$. In particular,

$$\lim_k \operatorname{diam}(B_k) = 0.$$

Clearly we can choose $n_1 < n_2 < \cdots$ such that

$$X_{n_k} \in B_k.$$

This (sub)sequence is therefore a Cauchy sequence and converges (Example 5).


FIGURE 4

After we had chosen the boxes $B_k$, we did not need to use the Cauchy
criterion. We might, for example, have applied the nested intervals theorem
in each coordinate.
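
The subdivision argument can be imitated on a computer for $m = 1$. What follows is a finite-sample sketch only: a program cannot test "contains $X_n$ for infinitely many $n$", so the code keeps the half that holds more of the sampled terms. That substitution, the function name `bisection_subsequence`, and the sample sequence $|\sin n|$ are all assumptions made for illustration.

```python
import math

def bisection_subsequence(x, steps):
    """Finite-sample sketch of the subdivision argument in [0, 1].

    x     : list of terms x_1, ..., x_N in [0, 1] (a finite stand-in for the sequence)
    steps : number of halvings to perform
    Returns the chosen indices n_1 < n_2 < ... (1-based).
    """
    lo, hi = 0.0, 1.0
    candidates = list(range(len(x)))    # positions of the terms in the current interval
    chosen, last = [], -1
    for _ in range(steps):
        mid = (lo + hi) / 2
        left = [i for i in candidates if x[i] <= mid]
        right = [i for i in candidates if x[i] >= mid]
        # "Infinitely many" is replaced by "the fuller half" for a finite sample.
        if len(left) >= len(right):
            candidates, hi = left, mid
        else:
            candidates, lo = right, mid
        later = [i for i in candidates if i > last]
        if not later:
            break                       # the finite sample is exhausted
        last = later[0]                 # first term in B_k with index beyond n_{k-1}
        chosen.append(last + 1)
    return chosen

x = [abs(math.sin(n)) for n in range(1, 20001)]
idx = bisection_subsequence(x, steps=8)
print(idx)
print([round(x[n - 1], 4) for n in idx])
```

The $k$-th chosen term lies in an interval of length $2^{-k}$, and all later choices lie in that same interval, which is the Cauchy property used in the proof.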


Exercises 


1. Prove this. In $R^1$, the bounded sequence $\{x_n\}$ has a smallest and a largest
accumulation point. They are (respectively) $\liminf x_n$ and $\limsup x_n$.


2. Let $z$ be a complex number of absolute value 1:

$$z = e^{i\theta}, \qquad 0 \le \theta < 2\pi.$$


What are the accumulation points of the sequence $\{z^n\}$? Distinguish between the
case where $\theta$ is a rational multiple of $2\pi$ and the case where it is not.


3. Let $\{X_n\}$ be a sequence. Suppose that $Y_1, Y_2, \ldots$ is a sequence of accumulation
points of $\{X_n\}$. Show that any accumulation point of the sequence $\{Y_n\}$ is an
accumulation point of the sequence $\{X_n\}$.


4. If $\{X_n\}$ is a Cauchy sequence and if some subsequence converges to $X$, then
$X = \lim X_n$.


5. Let $\{x_n\}$ be a bounded sequence of real numbers such that $|x_n - x_{n+1}| = 1$ for
each $n$. Show that the sequence has only a finite number of accumulation points.


6. If $\{x_n\}$ is a bounded sequence of real numbers such that $|x_n - x_{n+1}| > 1$ for
each $n$, can it have an infinite number of accumulation points?


7. Suppose that $\{x_n\}$ is a bounded sequence of real numbers such that

$$\limsup_n x_n < \sup_n x_n.$$

Show that, among the numbers $x_n$, there is a largest one.
8. Find a sequence $\{z_n\}$ of complex numbers with $|z_n| < 1$ such that


(a) no point $z$ with $|z| < 1$ is a point of accumulation of the sequence;
(b) every point $z$ with $|z| = 1$ is an accumulation point of the sequence.


*9. True or false? Let $A$ be a $k \times k$ matrix with complex entries. The set of all
accumulation points of the sequence $\{A^n\}$ is bounded.


2.5. Open and Closed Sets 


The title of this section mentions two very special classes of sets. One 
can do analysis in Euclidean space without either concept; however, quite 
a bit is lost if one does. The concepts of open set and closed set shed light 
on questions of convergence and on the geometry of mappings. 


Definition. The set $U$ is open (in $R^m$) if it is a neighborhood of each of
its points.

Thus, $U$ is open if and only if the following condition is satisfied: For
each point $X$ in $U$, there exists a positive number $r$ such that the open ball
$B(X; r)$ is contained (entirely) in $U$.


Theorem 6. The union of any collection of open sets is open. The intersection
of any finite collection of open sets is open.


Proof. Suppose $\{U_\alpha\}$ is a family of open sets. Let

$$U = \bigcup_\alpha U_\alpha.$$

If $X \in U$, then $X \in U_\alpha$ for some $\alpha$. Therefore (that) $U_\alpha$ is a neighborhood
of $X$. Since $U$ contains $U_\alpha$, $U$ is a neighborhood of $X$.
Suppose $U_1, \ldots, U_n$ are open sets. Let

$$U = \bigcap_{k=1}^{n} U_k.$$

Let $X \in U$. For each $k$, $1 \le k \le n$, there is a number $r_k > 0$ such that

$$B(X; r_k) \subset U_k.$$

Let $r$ be the least of the numbers $r_1, \ldots, r_n$. Then

$$B(X; r) \subset U.$$


Theorem 6 is virtually trivial. It is called a theorem, because it states 
properties of open sets which are used so often. 


EXAMPLE 12. Every open ball $B(X; r)$ is an open set. If $Y \in B(X; r)$
then $B(Y; t) \subset B(X; r)$ where

$$t = r - |X - Y|.$$

Thus, the union of any collection of open balls is open:

$$\bigcup_\alpha B(X_\alpha; r_\alpha).$$


Furthermore, every open set is of the last type. It is the union of those
open balls which it contains.


EXAMPLE 13. Let us look at open sets in $R^1$. Each open interval $(a, b)$
is an open set in $R^1$. On the other hand, an interval $(a, b]$ is not open in $R^1$,
because $b \in (a, b]$ but no open interval about $b$ is contained in $(a, b]$. The
unbounded interval $(a, \infty)$ is open in $R^1$.

Every open set in $R^1$ is a union of open intervals $(a, b)$. In this 1-dimensional
case, the open set $U$ can be expressed as a union of intervals
in a very special way. That is because open intervals in $R^1$ have this special
property: If several open intervals have a point in common, their union
is an open interval. (See Exercise 11.) Take any $x$ in the open set $U$. Let
$I_x$ be the union of all those open intervals which contain the point $x$ and are
contained in the set $U$. Then $I_x$ is an open interval (possibly unbounded).
Therefore, for each $x \in U$ there is a largest open interval $I_x$ which contains
$x$ and is contained in $U$. If $y \in I_x$, then plainly $I_y = I_x$. In other words, all
the points in $I_x$ belong to the same largest interval in $U$, namely $I_x$. Consequently,
if $x, y \in U$ then either $I_x = I_y$ or the intersection of $I_x$ and $I_y$
is empty. How many different intervals $I_x$ are there? Only a countable
number. (See Exercise 12 and Appendix.) Thus every open set in $R^1$ is
uniquely expressible as the union of a countable collection of open intervals
which are pairwise disjoint.


EXAMPLE 14. Let’s look at the space of $k \times k$ matrices (real or complex
entries). Let $U$ be the set of invertible matrices, i.e., matrices $A$ such
that there exists $A^{-1}$ with $AA^{-1} = A^{-1}A = I$. Is $U$ an open set? What would
that say? It would say that, if the matrix $A$ is invertible then every matrix
(sufficiently) near $A$ is also invertible. We showed earlier (Example 9) that
every matrix near the identity matrix is invertible: If $|T| < 1$, then

$$(I - T)^{-1} = \sum_{n=0}^{\infty} T^n;$$

or if $|I - S| < 1$, then

$$S^{-1} = \sum_{n=0}^{\infty} (I - S)^n.$$

So maybe $U$ is open. If $A$ is invertible, how close must $B$ stay to $A$ to
guarantee that $B$ is invertible? Now

$$A - B = A(I - A^{-1}B),$$

or

$$I - A^{-1}B = A^{-1}(A - B).$$

Thus,

$$|I - A^{-1}B| \le |A^{-1}|\,|A - B|.$$

Suppose

$$|A - B| < \frac{1}{|A^{-1}|}.$$

Then

$$|I - A^{-1}B| < 1;$$

hence, $A^{-1}B$ is invertible, and since $A^{-1}$ is invertible, that makes $B$ invertible.
To summarize, the set $U$ of invertible matrices is open because, if
$A \in U$, then $U$ contains the open ball of radius $|A^{-1}|^{-1}$ about the point
$A$.
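
The estimate in Example 14 also gives a way to compute $B^{-1}$ from $A^{-1}$: since $|I - A^{-1}B| < 1$, the series of Example 9 yields $(A^{-1}B)^{-1}$, and then $B^{-1} = (A^{-1}B)^{-1}A^{-1}$. The sketch below tries this on one $2 \times 2$ case. It assumes the norm is the Euclidean norm of the entries (which satisfies $|PQ| \le |P||Q|$); the helper names and the particular matrices are illustrative choices.

```python
def matmul(P, Q):
    """Product of two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm(P):
    """Euclidean norm of the entries (an assumed choice of matrix norm)."""
    return sum(abs(p) ** 2 for row in P for p in row) ** 0.5

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def inverse_near(A_inv, B, terms=200):
    """Approximate B^{-1} as (sum_n T^n) A^{-1}, where T = I - A^{-1}B and |T| < 1."""
    n = len(B)
    I = identity(n)
    AinvB = matmul(A_inv, B)
    T = [[I[i][j] - AinvB[i][j] for j in range(n)] for i in range(n)]
    assert norm(T) < 1, "B must satisfy |I - A^{-1}B| < 1"
    total, power = identity(n), identity(n)
    for _ in range(terms):
        power = matmul(power, T)        # T^1, T^2, ...
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return matmul(total, A_inv)

A = [[2.0, 1.0], [1.0, 1.0]]
A_inv = [[1.0, -1.0], [-1.0, 2.0]]      # inverse of A, checked by hand (det A = 1)
B = [[2.1, 0.9], [1.05, 1.1]]           # a small perturbation of A
radius = 1 / norm(A_inv)
diff = norm([[A[i][j] - B[i][j] for j in range(2)] for i in range(2)])
B_inv = inverse_near(A_inv, B)
print(diff < radius)                    # B lies in the ball of radius 1/|A^{-1}| about A
print(matmul(B, B_inv))                 # should be close to the identity matrix
```

The series is truncated after 200 terms, so the result is an approximation; the truncation error decays like a power of $|I - A^{-1}B|$.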


Definition. The point $X$ is a cluster point of the set $S$ if every neighborhood
of $X$ contains a point of $S$ which is different from $X$.


Lemma. Let $S$ be a subset of $R^m$ and let $X \in R^m$. The following are
equivalent (all true or all false).

(i) $X$ is a cluster point of the set $S$.
(ii) Every neighborhood of $X$ contains infinitely many points of $S$.
(iii) There exists a sequence $\{X_n\}$ in $S$ such that $X_n \neq X$ and $X = \lim_n X_n$.


Proof. Exercise.


The reader may have noticed the similarity of the concepts of “cluster
point of a set” and “accumulation point of a sequence”. It is important
to be clear about the relationship between the two ideas. If $\{X_n\}$ is a
sequence in $R^m$, then its image

$$S = \{X_n;\ n \in Z_+\}$$

is a subset of $R^m$. It is the set of all distinct vectors which occur as one
(or more) of the terms of the sequence. A point $X$ is a cluster point of $S$
if each neighborhood of $X$ contains infinitely many points of $S$. Such an
$X$ is surely an accumulation point of the sequence, because each neighborhood,
containing infinitely many points of $S$, must contain $X_n$ for infinitely
many values of $n$. On the other hand, if $X$ is a point of accumulation of
the sequence it need not be a cluster point of the set $S$. A simple example
should make this clear. The sequence of real numbers

$$0, 1, 0, 1, 0, 1, \ldots$$

has two points of accumulation, 0 and 1. The image of the sequence is
$S = \{0, 1\}$, and it has no cluster points at all. If all of the terms of the
sequence $\{X_n\}$ are distinct, then every accumulation point of the sequence
is a cluster point of $S$.

The terms “cluster point”, “limit point”, and “accumulation point” 
are used interchangeably in most parts of mathematics. We have elected 
to apply “accumulation” to sequences and “cluster” to sets, as a reminder 
of the distinction discussed in the last paragraph. 


Definition. The set K is closed if every cluster point of K is in K. 


A closed set is one which is closed under (the process of taking) limits. 
From the last lemma, it should be clear that these conditions on a set K 
are equivalent: 

(i) K is closed. 

(ii) If $\{X_n\}$ is a sequence of points in $K$ and if the sequence converges,
then the limit of the sequence is in K. 


Theorem 7. A set $S$ is open if and only if its complement (complementary
set) is closed.


Proof. Let $T$ be the complement of $S$:

$$T = \{X \in R^m;\ X \notin S\}.$$

To say that $T$ is closed is to say that, if $X \notin T$, then $X$ is not a cluster
point of $T$. Think about that.


Corollary. The intersection of any collection of closed sets is closed. 
The union of any finite collection of closed sets is closed. 


Proof. This follows from Theorems 6 and 7. We describe the proof
of the first statement. Let $\{K_\alpha\}$ be a collection of closed sets. For each $\alpha$,
let $U_\alpha$ be the complement of $K_\alpha$. Then each $U_\alpha$ is an open set and, therefore,
the union

$$U = \bigcup_\alpha U_\alpha$$

is an open set. This union is the complement of

$$K = \bigcap_\alpha K_\alpha,$$

the intersection of the complements of the sets $U_\alpha$. Since $U$ is open, $K$ is
closed.


We should remark that closed balls and closed intervals are closed 
sets. In view of Theorem 7, there is no need for a separate list of examples 
of closed sets. Every example of an open set provides an example of a 
closed set (and vice versa). But, the human mind being what it is, it doesn’t 
follow that just because we know about open sets we’ll recognize a closed 
set when we bump into it. 


EXAMPLE 15. Let’s look at a famous closed set, the Cantor set. We
shall refer to it often. Start with the closed interval $[0, 1]$ in $R^1$. Remove
the open middle one-third, i.e., the open interval $(\tfrac13, \tfrac23)$. What remains is
the union of two disjoint closed intervals. Remove the open middle one-third
of each of those intervals. (See Figure 5.) Now, remove the open
middle one-third of each of the four remaining closed intervals. Continue
ad infinitum. What remains is the Cantor set $K$.


FIGURE 5


We obtained $K$ by removing from $[0, 1]$ an open set $U$. That set is
the union of the sequence of open intervals: $(\tfrac13, \tfrac23)$, $(\tfrac19, \tfrac29)$, $(\tfrac79, \tfrac89)$, .... Since
$U$ is open and $[0, 1]$ is closed,

$$K = \{x \in [0, 1];\ x \notin U\}$$

is a closed set. The set $K$ is very thin. The lengths of the open intervals in
$U$ add up to

$$\frac13 + \frac29 + \frac{4}{27} + \cdots = \sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = 1.$$

But, there are a lot of points in $K$: uncountably many. In particular, $K$
contains many more points than the end points of the deleted intervals.
(Those are the obvious points in $K$.)

We can describe the Cantor set very nicely, if we use the ternary
rather than the decimal expansion of points in the interval $[0, 1]$. The
ternary expansion represents each $x \in [0, 1]$ as the sum of a series

$$x = \sum_{n=1}^{\infty} a_n 3^{-n}$$

where the “digits” $a_n$ are 0, 1, or 2. The digits are defined by locating the
point $x$ in a sequence of intervals, the lengths of which go down by a
factor of 3 each time:

$$a_1 = \sup \{k \in Z;\ k 3^{-1} \le x\}$$

$$a_2 = \sup \{k \in Z;\ a_1 3^{-1} + k 3^{-2} \le x\}$$

$$a_n = \sup \{k \in Z;\ a_1 3^{-1} + \cdots + a_{n-1} 3^{-(n-1)} + k 3^{-n} \le x\}$$

The Cantor set contains all those points $x$ which have a ternary representation
in which the digits are either 0 or 2 (no 1’s). So clearly there are just
as many points in $K$ as there are in the interval $[0, 1]$: Under the mapping
(function)

$$\sum_{n=1}^{\infty} a_n 3^{-n} \longmapsto \sum_{n=1}^{\infty} \frac{a_n}{2}\, 2^{-n} \qquad (a_n = 0 \text{ or } 2)$$

the image of $K$ covers $[0, 1]$.
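
The construction of Example 15 is easy to experiment with in exact rational arithmetic. The following sketch tests membership by repeating the middle-third removal, applies the digit map to a finite string of digits, and sums the removed lengths. The function names, the default depth of 60 rounds, and the restriction to rational inputs and finitely many digits are assumptions of the sketch.

```python
from fractions import Fraction

def in_cantor_set(x, depth=60):
    """Test whether the rational x survives `depth` rounds of middle-third removal."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    for _ in range(depth):
        if x <= Fraction(1, 3):
            x = 3 * x               # rescale the left third [0, 1/3] onto [0, 1]
        elif x >= Fraction(2, 3):
            x = 3 * x - 2           # rescale the right third [2/3, 1] onto [0, 1]
        else:
            return False            # x lies in a removed open middle third
    return True

def cantor_to_unit(digits):
    """Map ternary digits a_n in {0, 2} to the binary number with digits a_n / 2."""
    assert all(a in (0, 2) for a in digits)
    return sum(Fraction(a // 2, 2 ** n) for n, a in enumerate(digits, start=1))

# Total length removed in the first 40 stages: sum of 2^(n-1) / 3^n.
removed = sum(Fraction(2 ** (n - 1), 3 ** n) for n in range(1, 41))

print(in_cantor_set(Fraction(1, 4)))    # 1/4 = 0.020202... in base 3
print(in_cantor_set(Fraction(1, 2)))    # 1/2 lies in the first removed interval
print(cantor_to_unit([0, 2, 0, 2]))     # 0.0101 in base 2
print(float(1 - removed))               # length still present after 40 stages
```

The point 1/4 is in $K$ although it is not an end point of any deleted interval, which illustrates the remark that $K$ contains more than the obvious points.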

The “sequential compactness” of $R^m$ can be reformulated in the
language of this section. For emphasis, we shall record two reformulations.


Theorem 8 (Bolzano-Weierstrass). Every bounded and infinite subset of
$R^m$ has a cluster point.


Proof. If $S$ is an infinite set, we can select a sequence of points $X_n$ in
$S$ such that $X_k \neq X_n$ whenever $k \neq n$ (a sequence of distinct points). For
such a sequence, “contains infinitely many $X_n$’s” is the same as “contains
$X_n$ for infinitely many $n$’s”.


Theorem 9. Let

$$K_1 \supset K_2 \supset K_3 \supset \cdots$$

be a nested sequence of bounded closed sets in $R^m$. If each $K_n$ is non-empty,
then the intersection

$$\bigcap_n K_n$$

is non-empty.


Proof. For each $n$, there exists a point $X_n$ in $K_n$. The sequence $\{X_n\}$
is bounded and, accordingly, it has a point of accumulation $X$. Since $X_k \in K_n$
for $k \ge n$ and since $K_n$ is closed, $X$ must be in $K_n$.


Here is an amusing application of the weaker result. 


EXAMPLE 16. An analyst (of the mathematical variety) might proceed
this way to show that the medians of a triangle meet in a point. Let the
triangle be $ABC$ and let $X$, $Y$, $Z$ be the midpoints of the sides. By comparison
of similar triangles, it is clear that (1) the median lines of $ABC$
are also the median lines of $XYZ$, (2) diam $XYZ = \tfrac12$ diam $ABC$. (See
Figure 6.) Replace $ABC$ by $XYZ$ and repeat. Then repeat again, etc. Do
you see where Theorem 9 comes in?
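
The shrinking triangles of Example 16 can be watched numerically. This is a sketch with one arbitrarily chosen triangle; the helper names and the 30 iterations are illustrative. It uses the fact that the triangle of midpoints has the same centroid (average of the vertices) as the original triangle.

```python
def midpoint(P, Q):
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)

def diameter(tri):
    """Longest side length, which is the diameter of the (closed) triangle."""
    return max(
        ((P[0] - Q[0]) ** 2 + (P[1] - Q[1]) ** 2) ** 0.5
        for P in tri for Q in tri
    )

def medial(tri):
    """Replace ABC by XYZ, the triangle formed by the midpoints of its sides."""
    A, B, C = tri
    return (midpoint(B, C), midpoint(C, A), midpoint(A, B))

tri = ((0.0, 0.0), (4.0, 0.0), (1.0, 3.0))
centroid = (sum(P[0] for P in tri) / 3, sum(P[1] for P in tri) / 3)
for _ in range(30):
    tri = medial(tri)       # each step halves the diameter

print(diameter(tri))        # roughly (initial diameter) / 2**30
print(tri[0], centroid)     # the vertices have collapsed onto the centroid
```

The nested closed triangles have diameters tending to 0, so their intersection is a single point, and that point lies on every median.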


FIGURE 6 


One might think of Theorem 9 as a slightly more geometrical way of
stating the Bolzano-Weierstrass theorem. If (in Theorem 9) one knows
that diam $(K_n)$ converges to 0, then the intersection of all the $K_n$’s will consist
of precisely one point. That result is weaker than Theorem 9. It is
(essentially) a reformulation of the fact that each Cauchy sequence in $R^m$
converges.

Exercises


1. Show that there are (at least) two subsets of $R^m$ which are both open and
closed: $R^m$ and the empty set.


2. If $S$ is a subset of $R^m$ and if $S$ is both open and closed, then either $S = R^m$
or $S$ is empty. (You might prove it first in $R^1$.)
3. If $S$ is a subset of $R^m$, what is the union of all closed subsets of $S$?
4. Which of the following sets of complex numbers are closed? Which are
open?
(a) all $z$ such that $z = z^*$;
(b) all $z$ such that $zz^* > 2$;
(c) all $z$ such that $|z| < 1$ and $z \neq 0$;
(d) all $z$ such that $|z|$ is rational.


5. True or false? If $S$ is closed, then $S$ contains a cluster point of $S$.
6. True or false? If $S$ is a bounded infinite subset of $R^1$, then among the cluster
points of $S$ there is a largest one.


7. True or false? If every subset of S is closed, then S contains only a finite 
number of points. 

8. True or false? If every subset of S is open, then S contains only a finite num- 
ber of points. 

9. If $S$ is a set, the $X_0$-translate of $S$ is $X_0 + S = \{X_0 + Y;\ Y \in S\}$. Show that
each translate of an open set is open and each translate of a closed set is closed.
10. What can you say about scalar multiples of open [closed] sets?


11. If $\{I_\alpha;\ \alpha \in A\}$ is a family of open intervals on the real line and if the
intersection

$$\bigcap_\alpha I_\alpha$$

is non-empty, then the union is an open interval. (Remember the definition of
interval.)
12. Let $\{I_\alpha;\ \alpha \in A\}$ be a family of open intervals on the real line which are
pairwise disjoint:

$$I_\alpha \cap I_\beta = \emptyset, \qquad \alpha \neq \beta.$$

Prove that $A$ is a countable set.
13. Let $A$ be a subset of $R^m$ and let $B$ be a subset of $R^n$. The Cartesian product
$A \times B$ is a subset of $R^{m+n}$. Prove that


(a) if $A$ and $B$ are open, then $A \times B$ is open;
(b) if $A$ and $B$ are closed, then $A \times B$ is closed.


14. Is the set of orthogonal $k \times k$ matrices open?


15. Let $M$ be a linear subspace of $R^m$. Let $K$ be a closed subset of $R^m$. Project the
set $K$ orthogonally onto $M$. Do you end up with a closed set?

16. Every (linear) subspace of $R^m$ is closed.


17. A set $K$ is perfect if $K$ is closed and every point of $K$ is a cluster point of $K$.
Show that the Cantor set is a perfect set. 


*18. Every non-empty perfect set is uncountable. 


*19. Every closed set is the union of a perfect set and a countable set.


2.6. Closure and Interior 


Let $S$ be a subset of $R^m$. The closure of $S$ is the intersection of all
closed sets which contain $S$. The interior of $S$ is the union of all open
sets which are contained in $S$. The boundary of $S$ is the intersection of
the closure of $S$ with the closure of the complement of $S$.

The closure of $S$ will be denoted $\bar{S}$. Evidently $\bar{S}$ is a closed set. It is
the smallest closed set which contains $S$. If $X$ is a point in $R^m$, these conditions
are equivalent:

(i) $X \in \bar{S}$.
(ii) Either $X \in S$ or $X$ is a cluster point of $S$.
(iii) There exists a sequence of points in $S$ which converges to $X$.


The interior of $S$ will be denoted $S^\circ$ (although we shall not use the
notation very much). Evidently $S^\circ$ is an open set, the largest open subset
of $S$. The following conditions on the point $X$ are equivalent:

(i) $X \in S^\circ$.

(ii) Some open ball $B(X; r)$ is contained in $S$.

(iii) There does not exist any sequence of points in the complement
of $S$ which converges to $X$.


The following conditions on X are equivalent: 


(i) X is in the boundary of S. 
(ii) Every neighborhood of X contains a point in S and a point in the 
complement of S. 


EXAMPLE 17. Consider the open ball $B = B(X_0; r)$. It is an open set and
therefore is equal to its interior. The closure of $B$ is the closed ball
$\bar{B}(X_0; r)$. The boundary of $B$ is the sphere of radius $r$ about $X_0$:

$$S(X_0; r) = \{X;\ |X - X_0| = r\}.$$

In $R^2$, the “sphere” is a circle. In $R^1$, the “sphere” consists of the two endpoints
of the interval $B$.


EXAMPLE 18. Let’s look at some subsets of $R^2$. In order to understand
closure, interior, and boundary, one usually begins by drawing a set $S$
as in Figure 7. Indicated are a point $X$ in $S^\circ$, a point $Y$ on the boundary of
$S$, and a point $Z$ which is not in $\bar{S}$. The boundary of (this particular) $S$ is a
curve, and it does not matter whether $S$ is the open region bounded by the
curve or the closed region bounded by the curve. The interior, the closure
and the boundary are the same for those two regions. A region of either
type is a very special set. From the point of view of things which we are
now discussing, the nicest set is one which is closed, bounded and equal
to the closure of its interior (or, the interior of such a set).

FIGURE 7

In the same figure, let $T$ be the set obtained by deleting from $S$ that
part of the real axis which lies in $S$. The points on the deleted line segment
are in the boundary of $T$. Yet, somehow they seem “interior” to $T$ in a
weak sense. They are not in the interior of $T$, because they are not even in
$T$. But those points are in the interior of $\bar{T}$.

Let’s look at a set which is less nice. Let $V$ be the set of all points
$(x, y)$ in $R^2$ such that both $x$ and $y$ are rational numbers. Now $V$ is (to say
the least) scattered all over the place. Every point of $R^2$ is a cluster point
of $V$; hence $\bar{V} = R^2$. Evidently, the interior of $V$ is the empty set, because
every point of $R^2$ is also a cluster point of the complement of $V$. What is
the boundary of $V$? It’s also all of $R^2$.

One final set which we discuss is the interval $I$ on the real axis:

$$I = \{(x, 0);\ x \in (a, b)\}.$$

It should be clear that $\bar{I}$ is the corresponding closed interval on the real
axis:

$$\bar{I} = \{(x, 0);\ x \in [a, b]\}.$$

The interior of $I$ (in $R^2$) is empty, because each point of $I$ is in the boundary
of $I$. The boundary of $I$ is $\bar{I}$. If we were discussing the interval $(a, b)$ in
$R^1$, the interior and the boundary would be quite different. Hence, always
bear in mind that such sets are not intrinsically attached to the geometric
object which we envision. Closure, boundary, and interior are defined
relative to the ambient space.


Definition. Let S be a subset of T. We say that S is dense in T if every 
point of T is in the closure of S. 


Density has to do with approximation. To say $S$ is dense in $T$ means
that every point of $T$ can be approximated (as closely as one might wish)
by points in $S$, i.e., every point in $T$ is the limit of a sequence of points
from $S$. The set of rational numbers is dense in $R^1$. More generally, the
set of points $X = (x_1, \ldots, x_m)$ in $R^m$ such that every $x_j$ is rational is dense
in $R^m$. Every set is dense in its closure. If $K$ is a closed set, a subset $S$ is
dense in $K$ if and only if $\bar{S} = K$.


EXAMPLE 19. Once again, let us consider the space of $k \times k$ matrices
(real or complex entries). Let $S$ be the set of singular matrices, i.e., matrices
which are not invertible. Then $S$ is a closed set, because its complement
is the set of invertible matrices, which is an open set (Example 14, p. 57).
The interior of $S$ is the empty set. What does that say? It says that, if $A$
is a singular matrix, then every neighborhood of $A$ contains an invertible
matrix. In other words, if $A$ is singular, we can perturb $A$ just a little and
obtain an invertible matrix. How? Look at $A + cI$. For a $k \times k$ matrix
$A$, there are at most $k$ numbers $c$ for which $A + cI$ is singular, because
when $A + cI$ is singular

$$\det (A + cI) = 0$$

and the determinant of $A + cI$ is a polynomial in $c$ of degree $k$. The zeros
of that polynomial are the negatives of the characteristic values (eigenvalues)
of $A$. One of those zeros is 0, because $A$ is singular. Let the non-zero
zeros be $c_1, \ldots, c_r$. Let $\delta$ be the minimum of $|c_1|, \ldots, |c_r|$.
Then $A + cI$ is invertible for every $c$ such that $0 < |c| < \delta$. In particular,
$A + \varepsilon I$ is invertible for all small positive $\varepsilon$. Thus, the interior of the set of
singular matrices is empty. We can phrase that in terms of the complement.
The set of invertible $k \times k$ matrices is a dense open subset of Euclidean
space of dimension $k^2$.
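
The perturbation $A + cI$ of Example 19 can be checked on a small case. The sketch below uses one hand-picked singular $2 \times 2$ matrix; the helper names `det2` and `shift` are illustrative. For this matrix, $\det(A + cI) = c^2 + 5c$, so the only values of $c$ that fail are 0 and $-5$.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def shift(A, c):
    """Return A + cI for a 2 x 2 matrix A."""
    return [[A[0][0] + c, A[0][1]], [A[1][0], A[1][1] + c]]

A = [[1.0, 2.0], [2.0, 4.0]]    # singular: the second row is twice the first
print(det2(A))

for c in (1e-1, 1e-3, 1e-6):
    # det(A + cI) = c^2 + (tr A) c + det A, which is c (c + 5) for this A
    print(c, det2(shift(A, c)))
```

Every one of the shifted matrices is invertible, and they approach $A$ as $c$ tends to 0, so $A$ is a limit of invertible matrices.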


Exercises 


1. The interior of any set S is the complement of the closure of the complement 
of S. 


2. The union of the interior of $S$ and the interior of $R^m - S$ is the complement
of the boundary of $S$.


3. Find all subsets of $R^m$ for which the boundary is empty.
4. True or false? If $S$ and $S^\circ$ have the same boundary, then $S^\circ$ is dense in $S$.
5. True or false? If S and S have the same boundary, then S is bounded. 


6. True or false? If S° is bounded and the boundary of S is bounded, then S 
is bounded. 


7. Show that the set $S$ in $R^m$ is open if and only if $S \cap \bar{A} \subset \overline{S \cap A}$ for all
$A \subset R^m$.
8. Let $S$ be a linear subspace of $R^m$. Show that, if the interior of $S$ is non-empty,
then $S = R^m$.


9. Let $f$ be a real-valued function on the real line

$$R \xrightarrow{\ f\ } R.$$

Show that the graph of $f$ has empty interior (in $R^2$).


10. Let $G$ be an additive subgroup of $R$, i.e., a non-empty set of real numbers
such that $(x - y) \in G$ whenever $x \in G$ and $y \in G$. Show that either $G$ consists
of all integer multiples of some fixed real number or $G$ is dense in $R$. (Hint: Is
there a smallest positive number in $G$?)


11. Let

$$p(x, y) = \sum_{k,n=0}^{N} a_{kn} x^k y^n$$

be a (real) polynomial in two variables, $p \neq 0$. Let $K$ be the zero set of $p$:
$K = \{(x, y);\ p(x, y) = 0\}$. Show that the interior of $K$ is empty. Generalize to $n$
variables. (The determinant of a matrix is a polynomial function of the entries,
so Example 19 is a special case of this exercise.)


12. A set $S$ in $R^m$ is convex if, whenever $X$ and $Y$ are in $S$, the line segment
between $X$ and $Y$ is in $S$. That is, $S$ is convex if whenever $X$ and $Y$ are in $S$,
$tX + (1 - t)Y$ is in $S$ for all $0 \le t \le 1$.

(a) The closure of a convex set is convex.

(b) The interior of a convex set is convex.

(c) Let $S$ be a set such that $\tfrac12(X + Y)$ is in $S$, whenever $X$ and $Y$ are in $S$.
If $S$ is closed, then $S$ is convex.

*13. Show that the set of diagonalizable $k \times k$ matrices is a dense subset of the
space of $k \times k$ complex matrices. (Hint: The matrices which have $k$ distinct
characteristic values are diagonalizable.)


*14. Use the result of Exercise 13 to prove the Cayley-Hamilton theorem: Every
$k \times k$ matrix satisfies its characteristic equation.


2.7. Compact Sets 


Compactness is an important geometrical property enjoyed by every
closed bounded subset of $R^m$. The concept of compactness is concerned
with covering a set by open sets.


Definition. Let $S$ be a subset of $R^m$. An (indexed) cover (or covering) of
$S$ is a family of sets $\{U_\alpha;\ \alpha \in A\}$ such that

$$S \subset \bigcup_\alpha U_\alpha.$$
A cover of S is called 


(a) an open cover of S if each U, is an open set; 
(b) a finite cover of S if the index set A is finite; 
(c) a countable cover of S if the index set A is countable. 


We shall deal almost exclusively with open covers. If $\{U_\alpha\}$ is a cover
of $S$, we also say that the family $\{U_\alpha\}$ covers $S$. A countable cover is a
sequence of sets. Thus, a countable open cover of $S$ is a sequence of open
sets $\{U_n\}$ such that each point of $S$ is in (at least) one of the sets $U_n$.


EXAMPLE 20. Let $S$ be any subset of $R^m$. Then $S$ has myriad different
open covers. For instance, the family consisting of the single set $R^m$ is a
finite open cover of $S$. Or, let $\varepsilon > 0$ and let

$$U_X = B(X; \varepsilon).$$

Then $\{U_X;\ X \in S\}$ is an open cover of $S$, which happens also to be indexed
by the set $S$. Note that this particular cover $\{U_X;\ X \in S\}$ is also an open
cover of the closure $\bar{S}$: Every point in $\bar{S}$ is within distance $\varepsilon$ of some point
of $S$. Suppose we define

$$V_X = B(X; |X|).$$

Then $\{V_X;\ X \in S\}$ is an open cover of $S$ if $0 \notin S$. Let

$$B_n = \{X;\ |X| < n\}.$$

Then $\{B_n\}$ is a countable open cover of $S$, because it is a countable open
cover of $R^m$.


Suppose that $\{U_\alpha;\ \alpha \in A\}$ is a cover of the set $S$. Let $B$ be a subset of
the index set $A$. The family of sets $\{U_\alpha;\ \alpha \in B\}$ may or may not cover
$S$, i.e., it may or may not be true that

$$S \subset \bigcup_{\alpha \in B} U_\alpha. \tag{2.25}$$

If (2.25) is satisfied, then we call $\{U_\alpha;\ \alpha \in B\}$ a cover of $S$ subordinate to
$\{U_\alpha;\ \alpha \in A\}$. It is a common convention in mathematics to shorten this
terminology and say that $\{U_\alpha;\ \alpha \in B\}$ is a subcover of the cover
$\{U_\alpha;\ \alpha \in A\}$ provided (2.25) is satisfied. We shall follow this practice but the reader
should note how important it is to keep in mind which set is being covered.


Definition. The set K is compact if every open cover of K has a finite 
subcover. 


In order that $K$ should be compact, the following must be the case.
If we are given any family of open sets $\{U_\alpha\}$ which covers $K$, we can extract
from that family a finite number of sets $U_{\alpha_1}, \ldots, U_{\alpha_n}$ which (jointly) cover
$K$. The proof that a given set $K$ is compact must give a prescription for
performing that extraction on each of the multitude of different open
covers of $K$. The following theorem certainly provides us with many
examples of sets which are not compact.


Theorem 10. Every compact set is closed and bounded. 


Proof. Let $K$ be compact. First, we show that $K$ is bounded. One of
the open covers of $K$ is given by the sequence

$$B_n = B(0; n), \qquad n = 1, 2, 3, \ldots$$

Since $K$ is compact, some finite number of those sets cover $K$. Hence one
of them covers $K$, because $B_1 \subset B_2 \subset \cdots$

How do we show that $K$ is closed? Let $Y$ be a point which is not in $K$.
For $X \in K$ let

$$U_X = B(X; \tfrac12 |X - Y|).$$

Then $\{U_X;\ X \in K\}$ is an open cover of $K$. Since $K$ is compact, there exist
$X_1, \ldots, X_n$ in $K$ such that

$$K \subset \bigcup_{j=1}^{n} U_{X_j}.$$

Let $r$ be the smallest of the numbers $\tfrac12 |Y - X_j|$. Then $B(Y; r)$ is a
neighborhood of $Y$ which does not contain any point of $K$. We have
proved that, if $Y$ is not in $K$, then $Y$ is not in the closure of $K$. Thus, the
set $K$ is closed.


EXAMPLE 21. There is one type of (infinite) set which easily can be seen 
to be compact. Suppose $\{X_n\}$ is a convergent sequence and

    $X = \lim X_n$.

Let S be the image of the sequence together with the limit point X:

    $S = \{Y;\ Y = X \text{ or } Y = X_n \text{ for some } n\}$.

Then S is compact. We see this as follows. Let $\{U_\alpha\}$ be any open cover
of S. Then one of the sets $U_\alpha$ must contain X. Let's say $X \in U_{\alpha_0}$. Since
$X_n$ converges to X, the set $U_{\alpha_0}$ contains $X_n$ except for a finite number of
n's. Choose $U_{\alpha_1}, \ldots, U_{\alpha_k}$ so that they cover the $X_n$'s outside $U_{\alpha_0}$ and
$\{U_{\alpha_0}, \ldots, U_{\alpha_k}\}$ is a finite subcover.
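The extraction described in Example 21 can be carried out mechanically in a simple case. The following Python sketch is only an illustration under stated assumptions: the sequence is $X_n = 1/n$ with limit 0, and the open cover consists of one interval (-eps, eps) about the limit together with a small interval about each 1/n. The names `finite_subcover` and `covered`, and this particular cover, are hypothetical choices made for the example.

```python
from fractions import Fraction

def finite_subcover(eps):
    """Finite subcover of S = {0} U {1/n : n >= 1} taken from the assumed cover
    U_0 = (-eps, eps), U_n = (1/n - r_n, 1/n + r_n) with r_n = 1/(2n(n+1))."""
    eps = Fraction(eps)
    chosen = [("U_0", -eps, eps)]      # the set which contains the limit 0
    n = 1
    # Only finitely many terms 1/n lie outside U_0: those with 1/n >= eps.
    while Fraction(1, n) >= eps:
        r = Fraction(1, 2 * n * (n + 1))
        chosen.append((f"U_{n}", Fraction(1, n) - r, Fraction(1, n) + r))
        n += 1
    return chosen

def covered(point, intervals):
    """True if the point lies in at least one of the open intervals."""
    return any(lo < point < hi for _, lo, hi in intervals)

cover = finite_subcover(Fraction(1, 10))
points = [Fraction(0)] + [Fraction(1, n) for n in range(1, 1000)]
print(len(cover), all(covered(p, cover) for p in points))
```

With eps = 1/10 the sketch keeps the interval about 0 and the ten intervals about 1, 1/2, ..., 1/10; every later term of the sequence already lies in the interval about 0.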


Theorem 11. Let S be any subset of $R^m$. Every open cover of S has a
countable subcover.


Proof. Consider the set of all points in $R^m$ which have rational coor-
dinates. That is a countable set. For each point P with rational coordinates,
consider the set of open balls about P which have a rational radius. The
collection of all those open balls, as P ranges over all points with rational
coordinates, is a countable set. Why? Because we just described it as a
countable union of countable sets. (See Appendix.) Thus, there is a sequence
of open balls $B_1, B_2, B_3, \ldots$ such that every open ball with rational center
and rational radius is one of the sets $B_n$.

Now, let $\{U_\alpha;\ \alpha \in A\}$ be an open cover of a set S. For each positive
integer n, ask whether or not the ball $B_n$ is contained in one of the sets
$U_\alpha$. If it is, choose one index $\alpha_n$ so that

(2.26)    $B_n \subset U_{\alpha_n}$.

The claim is that the sequence of sets $U_{\alpha_n}$ covers S. (Note that $\alpha_n$ is not
defined for every n. If that bothers you, define $\alpha_n$ any way you please for
the n's where we did not define it.) Let $X \in S$. There exists an $\alpha$ such that
$X \in U_\alpha$. Choose an open ball $B(X; r) \subset U_\alpha$. Let P be a point with rational
coordinates such that

    $|P - X| < \tfrac{1}{2} r$.

Choose a rational number t such that

(2.27)    $|P - X| < t < \tfrac{1}{2} r$.

From (2.27) it follows that

    $X \in B(P; t) \subset B(X; r)$.

Now $B(P; t) = B_n$ for some n. Therefore that $B_n$ contains X and is inside
the set $U_\alpha$. Since $B_n$ is contained in some $U_\alpha$, we have the associated set
$U_{\alpha_n}$, which contains $B_n$ (2.26). Hence

    $X \in U_{\alpha_n}$.

Every point of S is in one of the sets $U_{\alpha_n}$.
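The key step of the proof, the choice of a rational center P and a rational radius t with $|P - X| < t < \tfrac{1}{2} r$, can be sketched in Python. This is an illustration only, under the assumption that X and r are themselves stored as exact rationals (possibly with large denominators); the function name `rational_ball` and the particular choices t = r/3 and the grid size q are hypothetical, and any other choices satisfying (2.27) would serve.

```python
from fractions import Fraction
from math import ceil

def rational_ball(X, r):
    """Given X (a tuple of Fractions) and a rational r > 0, return (P, t) with
    P a rational point and t a rational radius such that |P - X| < t < r/2,
    so that X lies in B(P; t) and B(P; t) is contained in B(X; r)."""
    m = len(X)
    r = Fraction(r)
    t = r / 3                        # any rational strictly below r/2 would do
    q = ceil(3 * m / (2 * r)) + 1    # grid fine enough to force |P - X| < t
    # Round each coordinate to the nearest multiple of 1/q.
    P = tuple(Fraction(round(x * q), q) for x in X)
    dist_sq = sum((p - x) ** 2 for p, x in zip(P, X))
    # Compare squared lengths so that no square roots are needed.
    assert dist_sq < t ** 2 < (r / 2) ** 2
    return P, t

X = (Fraction(1, 3), Fraction(22, 7))
P, t = rational_ball(X, Fraction(1, 100))
print(P, t)
```

If Z is in B(P; t), then $|Z - X| \le |Z - P| + |P - X| < 2t < r$, which is the inclusion used in the proof.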


The significance of Theorem 11 should be reasonably evident. If we 
wish to prove that a set is compact, we need only prove that it is “coun- 
tably compact”, i.e., that each countable open cover of it has a finite 
subcover. We shall use this shortly to prove that each closed and bounded 
subset of $R^m$ is compact. We first digress a bit to describe compactness in
terms of closed sets. 


Theorem 12. Let K be a compact set. If $\{K_\alpha;\ \alpha \in A\}$ is a family of
closed subsets of K and if, for each finite set of indices $\alpha_1, \ldots, \alpha_n$, the
intersection $K_{\alpha_1} \cap \cdots \cap K_{\alpha_n}$ is non-empty, then the intersection

    $\bigcap_{\alpha \in A} K_\alpha$

is non-empty.


Proof. Let $\{K_\alpha\}$ be any collection of closed subsets of K. For each
index $\alpha$, let $U_\alpha$ be the (open) complement of $K_\alpha$. Suppose that the
intersection

    $\bigcap_\alpha K_\alpha$

is empty. This means that its complement

    $\bigcup_\alpha U_\alpha$

contains K, i.e., that $\{U_\alpha\}$ is an open cover of K. Since K is compact, there
are indices $\alpha_1, \ldots, \alpha_n$ such that

    $K \subset U_{\alpha_1} \cup \cdots \cup U_{\alpha_n}$.

But then

    $K_{\alpha_1} \cap \cdots \cap K_{\alpha_n}$

is empty. Therefore, if the family $\{K_\alpha\}$ is such that this last phenomenon
cannot occur, the intersection of all the closed sets $K_\alpha$ must be non-empty.


In analysis, it is important to understand compactness via closed 
sets. That is why Theorem 12 is called a theorem, in spite of the fact that 
it is just the rewording of the defining property of compact sets obtained 
by using complements of open sets rather than the open sets themselves. 
The usefulness of the reformulation stems from the fact that compactness 
becomes a tool for proving that things exist: “There exists a point which
is in every $K_\alpha$.” We shall repeat part of the reasoning as we prove the most
important theorem about compactness. 


Theorem 13 (Heine-Borel). In $R^m$, every closed and bounded set is
compact.

Proof. This is the union of Theorem 11 and the Bolzano-Weierstrass
theorem. Suppose K is closed and bounded. Theorem 11 tells us that, in
order to show that K is compact, we must show that each countable open
cover $\{U_n\}$ has a finite subcover. Given such a cover, let

    $K_n = \{X \in K;\ X \notin U_k,\ 1 \le k \le n\}$.

The sets $K_n$ are nested

    $K_1 \supset K_2 \supset K_3 \supset \cdots$.

Since K is closed, each $K_n$ is closed. Since K is bounded, each $K_n$ (i.e., $K_1$)
is bounded. If each $K_n$ were non-empty, Bolzano-Weierstrass (Theorem 9)
would tell us that the intersection of all the $K_n$'s was non-empty. But, that
would mean that the sequence $\{U_n\}$ did not cover K. Therefore, some $K_n$
must be empty. If $K_N$ is empty, then $U_1, \ldots, U_N$ cover K.


Exercises 


1. Which of the following sets of complex numbers are compact ? 


(a) all z such that |z| > 1; 
(b) all z such that zz* = 2; 
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2; 


(c) all z such that |z| is rational and |z| - 1;
(d) all z such that 62 = 1.


2. What is the least number of open intervals of length $\varepsilon$ which will cover the
interval [0, 1]?


3. True or false? If K is a set of complex numbers for which the orthogonal 
projections onto the real axis and the imaginary axis are both compact, then 
K is compact. 


4. True or false? If S is a bounded set, there is a smallest compact set which 
contains S. 


5. Every bounded subset of $R^m$ can be covered by a finite number of open balls
of radius $\varepsilon$.


6. True or false? If A is closed and B is bounded, then A - B is compact. 


7. If A and B are compact, then the Cartesian product A x B is compact. 


8. True or false? If the set S has a largest compact subset, then S is compact.


9. True or false? If the boundary of S is compact and if the interior of S is 
compact, then S is compact. 


10. Give an example of a countable compact set which has a countably infinite 
set of cluster points. 


11. If a compact set has only a countable number of cluster points, it is a count- 
able set. 


12. Which of the following sets of k x k (real) matrices are compact?

(a) all A such that $A = A^t$;
(b) all A such that A is orthogonal;
(c) all A such that $A^2 = 0$;
(d) all A such that $A^2 = A$.


13. If $\{I_n\}$ is a sequence of open intervals which covers [0, 1], then

    $\sum_n \text{length}(I_n) > 1$.


14. If A and B are subsets of $R^m$, define

    $A + B = \{X + Y;\ X \in A \text{ and } Y \in B\}$.

(a) If A is any set and U is open, then A + U is open.
(b) If A is closed and B is compact, then A + B is closed.
(c) Give an example of closed sets A and B such that A + B is not closed.


2.8. Relative Topology 


In many situations, the space in which we operate will be a subset
of $R^m$, rather than the entire Euclidean space. For instance, we may want
to discuss functions which are defined only on an interval [a, b] of the real
line. In that case, the interval will become a temporary subuniverse, in
which we talk about points being near to one another, convergence, open
sets, closed sets, etc. Such concepts in the “sub-space” will be defined
by simply disregarding the remainder of $R^m$.

Fix a set S in $R^m$. If X is a point of S, a neighborhood of X relative to
S is a set T such that

(i) $T \subset S$;
(ii) T contains $B(X; r) \cap S$ for some r > 0.


In other words, a neighborhood of X relative to S is a subset of S 
which contains every point in S which is sufficiently close to X. 

Let V be a subset of S. We say that V is open relative to S if V is a
relative neighborhood of each of its points. Thus, V is open relative to
S if

    $V = S \cap U$

where U is an open set in $R^m$.

Let F be a subset of S. We say that F is closed relative to S if F con- 
tains every cluster point of F which is in the set S. Evidently, the following 
conditions on a set $F \subset S$ are equivalent:


(i) F is closed relative to S.
(ii) $F = S \cap K$ where K is closed in $R^m$.
(iii) If $\{X_n\}$ is a sequence in F which converges to a point X in the set
S, then X is in F.
(iv) The complement of F relative to S, S - F, is open relative to S.


There is now a string of obvious comments. The family of sets which 
are open relative to S is closed under the formation of arbitrary unions 
and finite intersections. The family of sets which are closed relative to S 
has the complementary property: arbitrary intersections, finite unions. 
The set S is both open and closed relative to S. If $\{X_n\}$ is a sequence in S,
then $\{X_n\}$ converges to the point $X \in S$ if and only if each neighborhood
of X relative to S contains $X_n$ except for finitely many n's.

We could now define interior relative to S, closure relative to S, etc. 
We shall not make much use of those terms. One further relativization 
deserves comment. A set is (of course) compact relative to S if each cover 
of it by relatively open sets has a finite subcover. If K is a subset of S, then
K is compact relative to S if and only if K is compact. That is trivial to verify; 
however, it tells us something. Vaguely, it says that compactness is an 
intrinsic property of (compact) geometric objects. 


EXAMPLE 22. Most often, we shall deal with relatively open, etc., when 
the set S is either an open set or a closed set. Suppose that S is an open
subset of $R^m$. Then, “open relative to S” merely means “open set contained
in S”. “Closed relative to S” is more interesting. For instance, the interval
(a, c] is closed relative to the interval (a, b), a < c < b.

If S is a closed subset of $R^m$, then “closed relative to S” becomes
“closed subset of S”, whereas “open relative to S” is more interesting. The
interval $(\tfrac{1}{2}, 1]$ is open relative to the interval [0, 1]. The set {0} is open
relative to the set S = {0, 1} which consists of the numbers 0 and 1.


EXAMPLE 23. Suppose that S is a linear subspace of $R^m$. Let us say
k = dim (S). Then S is a closed subset of $R^m$. Also, S intrinsically looks
like $R^k$. We can decompose each X in $R^m$ into a sum

    X = Y + Z

where Y is in S and Z is orthogonal to (every vector in) S. The vector Y
is the orthogonal projection of X on S:

    $P\colon R^m \to S$,    Y = P(X).

One can see easily that the set $V \subset S$ is open [closed] relative to S if and
only if

    $P^{-1}(V) = \{X;\ P(X) \in V\}$

is open [closed] in $R^m$. This is illustrated in Figure 8.


FIGURE 8 


One geometric property which is conveniently defined via relatively
open (or relatively closed) sets is connectedness. If S is a subset of $R^m$,
we want to say what it means for S to be connected. In Figure 9, there is
a set S which probably we would all describe as disconnected. The figure
eight, called F, does not appear disconnected, so probably we would call
it connected. But suppose we let G be the figure eight with its middle point
removed. Is that a disconnected set? Perhaps it ought to be so described,
since it falls apart so naturally into two pieces. But, how is that different
from decomposing F into two pieces, one being the left and the other being
the right half plus the midpoint? Here is the way in which we shall dis-
tinguish.


FIGURE 9 (the set $S = S_1 \cup S_2$ and the figure eight F)


Definition. The set S is disconnected if there exist two subsets $S_1$ and $S_2$
with these properties:


(i) $S_1$ and $S_2$ are non-empty; $S_1 \cap S_2 = \emptyset$;
(ii) $S = S_1 \cup S_2$;
(iii) $S_1$ and $S_2$ are closed relative to S.


If S is not disconnected, then S is connected. 


Observe that the following conditions on a set S are equivalent: 


(i) S is connected.
(ii) If $S = U_1 \cup U_2$ where $U_1$ and $U_2$ are disjoint relatively open
subsets of S, then either $U_1$ or $U_2$ is empty.
(iii) If T is a subset of S which is both open and closed relative to S,
then either T = S or T is empty.


In particular, an open set S in $R^m$ is disconnected if and only if
$S = U_1 \cup U_2$ where $U_1$, $U_2$ are disjoint non-empty open sets in $R^m$. Similarly,
a closed set S is disconnected if and only if $S = K_1 \cup K_2$, where $K_1$, $K_2$
are disjoint non-empty closed sets in $R^m$.

We shall not expend much energy worrying about the connectedness 
of sets which are neither open nor closed; however, it may increase our 
understanding of the concept if we prove the following. 


Theorem 14. Let S be a subset of $R^m$. If S is not connected, there exist
open sets $U_1$, $U_2$ in $R^m$ such that


(a) $S \subset U_1 \cup U_2$;
(b) $S \cap U_1 \ne \emptyset \ne S \cap U_2$;
(c) $U_1$ and $U_2$ are disjoint.


Proof. The fact that S is not connected provides us with open sets
$U_1$, $U_2$ which satisfy (a), (b) and


(c)' $S \cap U_1 \cap U_2$ is empty.


But $U_1$ and $U_2$ may intersect at points outside of S. We must show that
we can shrink $U_1$ and $U_2$ a bit, so that the new open sets become disjoint
but (a) and (b) are not disturbed. As a matter of fact, the shrinking can be
done so that neither $U_1 \cap S$ nor $U_2 \cap S$ is changed, and the remainder
of the proof has nothing whatever to do with S.

Let

    $T_1 = U_1 - U_2$
    $T_2 = U_2 - U_1$.

If $X \in T_1$, then $X \in U_1$ and so there is a positive number $r_1(X)$ such that

    $B(X; r_1(X)) \subset U_1$.

Similarly, if $Y \in T_2$, then $Y \in U_2$ and

    $B(Y; r_2(Y)) \subset U_2$

for some $r_2(Y) > 0$. Let

    $V_1 = \bigcup_{X \in T_1} B(X;\ \tfrac{1}{2} r_1(X))$
    $V_2 = \bigcup_{Y \in T_2} B(Y;\ \tfrac{1}{2} r_2(Y))$.

Then $V_j$ is an open set which contains $T_j$. (Therefore, $V_j \cap S = U_j \cap S$.)
Furthermore, $V_1$ and $V_2$ are disjoint. If

    $Z \in V_1 \cap V_2$

then, for some $X \in T_1$ and $Y \in T_2$,

    $|Z - X| < \tfrac{1}{2} r_1(X)$
    $|Z - Y| < \tfrac{1}{2} r_2(Y)$.

One of the numbers $r_1(X)$, $r_2(Y)$ is at least as large as the other, say,
$r_1(X) \ge r_2(Y)$. Then

    $|X - Y| \le |Z - X| + |Y - Z|$
            $< \tfrac{1}{2} r_1(X) + \tfrac{1}{2} r_2(Y)$
            $\le r_1(X)$.

But that means that Y is in $U_1$, whereas $Y \in (U_2 - U_1)$. So $V_1$ and $V_2$
are disjoint.
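The shrinking in this proof can be watched at work on the real line. The sketch below is only an illustration under assumed data: $U_1 = (0, 2)$ and $U_2 = (1, 3)$, which overlap on (1, 2), with $r_j$ taken to be the distance to the complement of $U_j$. The sets $T_1$, $T_2$ are sampled on a finite grid, so the code tests subsets of $V_1$ and $V_2$ rather than the sets themselves; the names `radius` and `in_V` are hypothetical.

```python
from fractions import Fraction

# Assumed one-dimensional data: U1 = (0, 2) and U2 = (1, 3) overlap on (1, 2).
U1 = (Fraction(0), Fraction(2))
U2 = (Fraction(1), Fraction(3))

def radius(x, U):
    """Largest r with B(x; r) contained in the open interval U (0 if x is outside U)."""
    lo, hi = U
    return max(min(x - lo, hi - x), Fraction(0))

# Sample points of T1 = U1 - U2 = (0, 1] and of T2 = U2 - U1 = [2, 3).
T1 = [Fraction(i, 100) for i in range(1, 101)]
T2 = [Fraction(i, 100) for i in range(200, 300)]

def in_V(z, T, U):
    """True if z lies in one of the half-radius balls B(x; r(x)/2), x in T."""
    return any(abs(z - x) < radius(x, U) / 2 for x in T)

grid = [Fraction(i, 200) for i in range(0, 601)]
overlap = [z for z in grid if in_V(z, T1, U1) and in_V(z, T2, U2)]
print(overlap)
print(in_V(Fraction(7, 5), T1, U1), in_V(Fraction(8, 5), T2, U2))
```

In this example the full construction gives $V_1 = (0, \tfrac{3}{2})$ and $V_2 = (\tfrac{3}{2}, 3)$: the overlap (1, 2) has been split at its midpoint, and no sampled point lies in both sets.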


Corollary. If $K_1$ and $K_2$ are disjoint closed sets in $R^m$, there exist
open sets $V_1$, $V_2$ such that $K_j \subset V_j$ and $V_1$ and $V_2$ are disjoint.


Proof. This is the case of the argument above in which $S = K_1 \cup K_2$,
$U_1$ is the complement of $K_2$, and $U_2$ is the complement of $K_1$.


Exercises 


1. In the set of integers, every subset is relatively open (and relatively closed). 
2. Every interval in R is connected.


3. If S is a connected subset of R, then S is empty, S contains precisely one
point, or S is an interval.

4. Every linear subspace of $R^m$ is connected.


5. If K is an infinite compact set, then K has a subset which is not closed rela- 
tive to K. 


6. Let S be a bounded open set which is connected. Show that any two points 
in S can be joined by a polygonal path which lies in S, i.e., if $A, B \in S$, there are
points $A = X_0, X_1, \ldots, X_n = B$ such that (for each k) the line segment from
$X_{k-1}$ to $X_k$ lies in S. (Hint: Fix A, and consider the set of points B which can be
so joined to A. Show that the set is both open and closed relative to S.) 


7. Every convex set is connected. (A set is convex if it contains the line segment 
joining each pair of its points.) 

8. If S is connected and if $S \subset T \subset \bar{S}$, then T is connected. In particular, the
closure of a connected set is connected.

9. True or false? The interior of a connected set is connected. 
10. True or false? The boundary of a connected compact set is connected. 


11. Let K be the Cantor set (Example 15). Show that no connected subset of K 
contains more than one point. 


12. From R”, remove a hyperplane (level set of a linear functional). The remain- 
ing set is disconnected. 


13. From C”, remove a hyperplane (level set of a complex-linear functional.) 
The remaining set is connected. 


14. The set of invertible k x k matrices with real entries is not a connected sub-
set of $R^{k \times k}$.


15. The set of invertible k x k matrices with complex entries is a connected sub-
set of $C^{k \times k}$.


16. In $R^m$, m > 2, spheres are connected.


17. Consider a polynomial p(x, y) in two (real) variables, $p \ne 0$. Let
$K = \{(x, y);\ p(x, y) = 0\}$.

(a) Give an example where K is empty.

(b) Give an example where K contains just one point.

(c) Give an example where K is not compact.

(d) If K contains more than one point, can the complement of K be con-
nected?


18. Let S be a subset of $R^m$. If $X \in S$, let $S_X$ be the union of all connected sub-
sets of S which contain X. Then $S_X$ is (non-empty and) connected. If $S_X$ and $S_Y$
have any point in common, then $S_X = S_Y$. The distinct sets $S_X$ are called the
connected components of S.


*19. If K is a compact set and $X \in K$, the connected component of K which
contains X is the intersection of all relatively open-closed sets which contain X.


3. Continuity 


3.1. Continuous Functions 


The concept of continuity permeates mathematics. It can be formu- 
lated in great generality; however, for the time being, we shall deal with it 
in the context of maps from one Euclidean space into another. The reader 
may wish to refer to the Appendix, to review the basic notation and termi- 
nology concerning functions. 

We consider a function F

(3.1)    $F\colon D \to R^m, \quad D \subset R^k$

which is defined on a subset of $R^k$ and which maps that set into $R^m$.
Roughly speaking, we say that F is continuous if F preserves proximity:
If $X_1$ is near $X_2$, then $F(X_1)$ is near $F(X_2)$. There is a variety of ways to
make this precise. We shall look at several of these ways and show that 
they are equivalent. In the one which we choose as the definition, we shall 
describe proximity by open sets, because this lends itself readily to subse- 
quent generalization. 


Definition. The function F (3.1) is continuous if, for every open set V
in $R^m$, the inverse image $F^{-1}(V)$ is open relative to the domain of definition
D.


Does this say that a continuous function sends points which are close 
together into points which are close together? In a sense, it does. Let us 
see why. Suppose F is a continuous function, and let $X_0$ be a point in the
domain D. Let's look at the set of all points which are near the image
point $F(X_0)$, i.e., a neighborhood of $F(X_0)$. What we have in mind is a
“small” neighborhood; so, let us consider an open ball

    $B_\varepsilon = B(F(X_0); \varepsilon)$.

We assert that every point of D which is sufficiently near to $X_0$ is mapped
by F into this neighborhood $B_\varepsilon$. Why? The definition of continuity tells
us that

    $F^{-1}(B_\varepsilon) = \{X \in D;\ F(X) \in B_\varepsilon\}$

is open relative to D. Therefore, this set contains the intersection of D
with some open ball about $X_0$:

(3.2)    $D \cap B(X_0; \delta) \subset F^{-1}(B_\varepsilon)$.

In other words, (3.2) says

    $|F(X) - F(X_0)| < \varepsilon, \quad X \in D, \quad |X - X_0| < \delta$.

Continuity has told us that, if we are given a point $X_0 \in D$ and a number
$\varepsilon > 0$, there exists some number $\delta > 0$ such that every point of the domain
which is within distance $\delta$ of $X_0$ is mapped by F into a point within distance
$\varepsilon$ of $F(X_0)$. This property characterizes continuous functions, as we shall
state formally in a moment.

We might note that most of the preceding discussion dealt only with
how F behaved near a single point $X_0$.


Definition. The function F is continuous at the point $X_0$ in D provided
this condition is satisfied: If V is a neighborhood of $F(X_0)$ in $R^m$, then the
inverse image $F^{-1}(V)$ is a neighborhood of $X_0$ relative to D.


It is easy to see that F is continuous if and only if F is continuous at 
each point of its domain D. Let us record what we observed about $\varepsilon$'s and
$\delta$'s and add a bit more information.


Theorem 1. Let $X_0$ be a point in D, the domain of the function F. The
following are equivalent:


(i) F is continuous at the point $X_0$.
(ii) For each $\varepsilon > 0$, there exists a number $\delta > 0$ such that

    $|F(X) - F(X_0)| < \varepsilon$

for every point $X \in D$ with $|X - X_0| < \delta$.
(iii) If $\{X_n\}$ is a sequence in D which converges to $X_0$, then the sequence
$\{F(X_n)\}$ converges to $F(X_0)$.


Proof. We noted that (i) implies (ii). Suppose (ii) is satisfied. We
shall verify (iii). Let $X_n \in D$ and $X_0 = \lim X_n$. Consider an open ball
$B(F(X_0); \varepsilon)$. From (ii) we obtain $\delta > 0$ such that

(3.3)    $|F(X) - F(X_0)| < \varepsilon, \quad X \in D, \quad |X - X_0| < \delta$.

Since $X_n$ converges to $X_0$, we have $|X_n - X_0| < \delta$ except for a finite num-
ber of n's. Thus,

    $|F(X_n) - F(X_0)| < \varepsilon$

except for a finite number of n's. Since this is true for every $\varepsilon > 0$, the
sequence $\{F(X_n)\}$ converges to $F(X_0)$.

Now, suppose condition (iii) is satisfied. We shall verify (i). Let V be
a neighborhood of $F(X_0)$. Is $F^{-1}(V)$ a neighborhood of $X_0$ relative to D?
If it is not, then for each n we can select $X_n \in D$ with $|X_n - X_0| < 1/n$
but $X_n \notin F^{-1}(V)$. Then $\{X_n\}$ converges to $X_0$. By (iii), $\{F(X_n)\}$ converges
to $F(X_0)$. Thus, since V is a neighborhood of $F(X_0)$, some $F(X_n)$ is in V. But
$X_n \notin F^{-1}(V)$, so we cannot have that situation. Conclusion: $F^{-1}(V)$ must
be a neighborhood of $X_0$ relative to D.

Since we have shown that

    (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (i)

it is clear that if F possesses any of the three properties, it possesses the
other two as well.


In order to deal successfully with continuity, there are two things 
which we must acquire rapidly: 


1. a basic list of continuous functions; 
2. methods for combining continuous functions to produce new con- 
tinuous functions. 


We also need to look at some discontinuous functions, in order to sharpen 
our understanding of the definition of continuity. Before we turn to ex- 
amples, let us take note of the most fundamental method for combining 
continuous functions. 


Theorem 2. Let F be a function from (a subset of) $R^k$ into $R^m$ and let G
be a function from (a subset of) $R^m$ into $R^n$ such that the domain of G con-
tains the image of F. If F is continuous at the point $X_0$ and if G is continuous
at the point $F(X_0)$, then the composition $G \circ F$ is continuous at $X_0$.


Proof. The statement of this theorem is as long as its proof. Let W
be a neighborhood of $(G \circ F)(X_0) = G(F(X_0))$. (See Figure 10.) Then
$G^{-1}(W)$ is a neighborhood of $F(X_0)$ relative to $D_G$, the domain of G. There-
fore $G^{-1}(W) = V \cap D_G$, where V is a neighborhood of $F(X_0)$ in $R^m$. Since
$D_G$ contains the image of F, $F^{-1}(V \cap D_G) = F^{-1}(V)$; hence, $F^{-1}(G^{-1}(W))$
is a neighborhood of $X_0$ relative to $D_F$. But $F^{-1}(G^{-1}(W)) = (G \circ F)^{-1}(W)$.


The reader should carry out the proof of Theorem 2 using the refor-
mulations of continuity contained in Theorem 1. For example, if $\{X_n\}$ is a
sequence of points in the domain of F which converges to $X_0 \in D_F$,
then $\{F(X_n)\}$ converges to $F(X_0)$ because F is continuous. Accordingly
$G(F(X_n))$ converges to $G(F(X_0))$ because G is continuous.


FIGURE 10


A function

    $F\colon D \to R^m$

may be regarded as an m-tuple of real-valued functions:

    $F = (f_1, \ldots, f_m)$
    $Y = F(X)$
    $y_1 = f_1(x_1, \ldots, x_k)$
    $\vdots$
    $y_m = f_m(x_1, \ldots, x_k)$.

The function $f_j$ is the composition of F with the jth standard coordinate
function on $R^m$, that is, $f_j(X)$ is the jth coordinate of the point F(X). The
function $F = (f_1, \ldots, f_m)$ is continuous if and only if each $f_j$ is continuous.
We leave the proof as an exercise.


EXAMPLE 1. Vector addition is a continuous function. Of course,
vector addition in $R^m$ is not a function on $R^m$; it is a function on $R^{2m}$.
Think of $R^{2m}$ as $R^m \times R^m$, each 2m-tuple being a pair (X, Y) where X is
the m-tuple of the first m coordinates and Y is the m-tuple of the last m
coordinates:

    $(X, Y) = (x_1, \ldots, x_m, y_1, \ldots, y_m)$.

Addition is the function

    $A\colon R^{2m} \to R^m$
    $A(X, Y) = X + Y$.

We verified the continuity in sequential form in the lemma of Section 2.1.
The reader should also ask himself what the largest $\delta$ is for which the ball
of radius $\delta$ about (X, Y) is contained in the inverse image of $B(X + Y; \varepsilon)$.

Now, it is easy to see that the sum of two continuous functions is con-
tinuous. Suppose

    $F\colon D \to R^m$
    $G\colon D \to R^m, \quad D \subset R^k$.

Then (F + G)(X) = F(X) + G(X). The fact that F + G is continuous is
clear, by the same argument which shows that addition is continuous. In
fact, the continuity of F + G can be deduced from the continuity of addi-
tion, as follows. Let $F \times G$ be the function

    $F \times G\colon D \to R^{2m}$
    $(F \times G)(X) = (F(X), G(X))$.

If F, G are continuous, obviously $F \times G$ is. Observe that
$F + G = A \circ (F \times G)$, where A is addition. By Theorem 2, F + G is continuous.
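The factorization $F + G = A \circ (F \times G)$ is easy to mirror in code. The sketch below is a hypothetical illustration: vectors are Python tuples, and the maps F and G from $R^1$ into $R^2$ are arbitrary examples chosen only to exercise the three helper functions `product`, `addition` and `compose`, none of which come from the text.

```python
def product(F, G):
    """F x G: X -> (F(X), G(X)), returned as a single 2m-tuple."""
    return lambda X: tuple(F(X)) + tuple(G(X))

def addition(m):
    """A: R^{2m} -> R^m, A(X, Y) = X + Y, acting on a 2m-tuple."""
    return lambda XY: tuple(XY[i] + XY[m + i] for i in range(m))

def compose(G, F):
    """The composition G o F."""
    return lambda X: G(F(X))

# Two assumed maps from R^1 into R^2.
F = lambda X: (X[0], X[0] ** 2)
G = lambda X: (1.0, -X[0])

F_plus_G = compose(addition(2), product(F, G))
print(F_plus_G((3.0,)))   # F(3) = (3, 9), G(3) = (1, -3), so the sum is (4.0, 6.0)
```

Once the pieces are written this way, continuity of the sum is an instance of Theorem 2: a composition of continuous maps.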


EXAMPLE 2. Matrix multiplication is continuous. Multiplication of 
real k x k matrices is a function from Euclidean space of dimension $2k^2$
into Euclidean space of dimension $k^2$:

    $M\colon R^{k \times 2k} \to R^{k \times k}$
    $M(A, B) = AB$.

We verified this continuity in sequential form in Example 2 of Chapter 2.
Similarly, the product of two continuous matrix-valued functions is con-
tinuous. The matrices may have complex entries as well as real ones. In
particular, the product of two continuous real-valued (complex-valued) func-
tions is continuous.


EXAMPLE 3. Every polynomial function on $R^n$ is continuous. Such a
function has the form

(3.4)    $f(x_1, \ldots, x_n) = \sum a_{k_1 \cdots k_n}\, x_1^{k_1} \cdots x_n^{k_n}$

where the $a_{k_1 \cdots k_n}$ are constants. The coordinate functions

    $f_j(X) = x_j, \quad 1 \le j \le n$

are continuous:

    $|f_j(X) - f_j(X_0)| \le |X - X_0|$.

Of course, constant functions are continuous. Consequently, by repeated
application of the fact that sums and products of continuous functions are
continuous, we see that every polynomial function is continuous.


EXAMPLE 4. Every polynomial function on $C^n$:

(3.5)    $f(z_1, \ldots, z_n) = \sum a_{k_1 \cdots k_n}\, z_1^{k_1} \cdots z_n^{k_n}$

is continuous, by the same reasoning.


EXAMPLE 5. Inversion is continuous, on the set of invertible matrices.
Refer to Example 14 of Chapter 2, if necessary. Let A be a fixed k x k
matrix which is invertible. If $|A - B| < |A^{-1}|^{-1}$, then $|I - A^{-1}B| < 1$,
so that

    $B^{-1}A = [I - (I - A^{-1}B)]^{-1} = \sum_{n=0}^{\infty} (I - A^{-1}B)^n$

    $B^{-1}A - I = \sum_{n=1}^{\infty} (I - A^{-1}B)^n$

    $|B^{-1} - A^{-1}| \le |A^{-1}| \sum_{n=1}^{\infty} |I - A^{-1}B|^n$.

Now

    $|I - A^{-1}B| \le |A^{-1}|\,|A - B|$

so that

    $|B^{-1} - A^{-1}| \le |A^{-1}| \sum_{n=1}^{\infty} |A^{-1}|^n |A - B|^n$

that is,

(3.6)    $|B^{-1} - A^{-1}| \le \dfrac{|A^{-1}|^2\, |A - B|}{1 - |A^{-1}|\,|A - B|}$.

It is then clear that $|B^{-1} - A^{-1}|$ is small when $|A - B|$ is small.


The reader to whom the continuity is not clear from (3.6) may wish
to carry the detailed reasoning a few steps further. Given an invertible
matrix A and a number $\varepsilon > 0$, we wish to find $\delta > 0$ such that
$|B^{-1} - A^{-1}| < \varepsilon$ for all matrices B which satisfy $|B - A| < \delta$. According
to the inequalities which we just derived, this will be so provided

    $\delta < |A^{-1}|^{-1}$

    $\dfrac{|A^{-1}|^2\, \delta}{1 - |A^{-1}|\, \delta} \le \varepsilon$.

The largest $\delta$ which has these properties is the one for which equality
holds in the second condition:

    $\delta = \dfrac{\varepsilon}{|A^{-1}|\,(|A^{-1}| + \varepsilon)}$.
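The choice of $\delta$ can be tested in the simplest case, that of 1 x 1 matrices, where inversion is the map b -> 1/b on the non-zero numbers and $|A^{-1}| = 1/|a|$. The following sketch is an illustration under that assumption; the function name `delta_for_inverse` and the sample values a = 2, $\varepsilon$ = 1/10 are hypothetical choices, and exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

def delta_for_inverse(a, eps):
    """delta = eps / (|a^-1| (|a^-1| + eps)) for the 1 x 1 matrix a != 0."""
    inv_norm = 1 / abs(Fraction(a))
    eps = Fraction(eps)
    return eps / (inv_norm * (inv_norm + eps))

a, eps = Fraction(2), Fraction(1, 10)
delta = delta_for_inverse(a, eps)

# Sample numbers b with |b - a| < delta and confirm that |1/b - 1/a| < eps.
samples = [a + delta * Fraction(k, 100) for k in range(-99, 100)]
print(delta, all(abs(1 / b - 1 / a) < eps for b in samples))
```

For a = 2 and $\varepsilon$ = 1/10 the formula gives $\delta$ = 1/3, and the error $|1/b - 1/a|$ approaches $\varepsilon$ only as b approaches $a - \delta$.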

As a special case of the continuity of inversion, if f is a real or complex-
valued function, then 1/f is continuous at X if f is continuous at X and
$f(X) \ne 0$. This tells us that rational functions (quotients of polynomials) are
continuous off the zero sets of their denominators.

If one knows that inversion is continuous on non-zero numbers (1 x 1
matrices), it follows that rational functions are continuous where defined.
One can conclude from that fact that inversion of matrices is continuous:
The entries of $A^{-1}$ are rational functions of the entries of A, because the
i, j entry of $A^{-1}$ is

    $(-1)^{i+j}\, \dfrac{\det A(j|i)}{\det A}$

where A(j|i) is the (k - 1) x (k - 1) matrix obtained by deleting the
jth row and ith column of A. But, a careful proof that inversion is con-
tinuous on the set of non-zero numbers is not appreciably simpler than
the argument we gave for matrices.


EXAMPLE 6. Let D be the set of non-negative real numbers. Let n be
a positive integer. The nth root function

    $f(x) = x^{1/n}$

is continuous. Fix a positive number t. We want to show that $x^{1/n} - t^{1/n}$ is
small provided x - t is small. In order to compare the sizes of these two
numbers we let $a = t^{1/n}$ and $b = x^{1/n}$ and compare $b^n - a^n$ to b - a. Now

    $b^n - a^n = (b - a)(b^{n-1} + b^{n-2}a + \cdots + a^{n-1})$.

Thus,

    $|b^n - a^n| \ge |b - a|\, n c^{n-1}$

where c is the smaller of a and b. If we keep x sufficiently near t so that
x > t/2 then we shall have

    $|x - t| \ge |x^{1/n} - t^{1/n}| \cdot n \left(\dfrac{t}{2}\right)^{(n-1)/n}$

or

    $|x^{1/n} - t^{1/n}| \le \dfrac{1}{n} \left(\dfrac{2}{t}\right)^{(n-1)/n} |x - t|, \quad x > \dfrac{t}{2}$.

From this inequality it should be apparent that $f(x) = x^{1/n}$ is continuous
at x = t.
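The final inequality of Example 6 can be spot-checked numerically. The sketch below is an illustration only: it uses floating-point arithmetic, so a small tolerance is allowed in the comparison, and the function name `root_bound_holds` and the sample values of n, t and x are hypothetical.

```python
def root_bound_holds(x, t, n):
    """Check |x^(1/n) - t^(1/n)| <= (1/n) (2/t)^((n-1)/n) |x - t| for x > t/2."""
    assert t > 0 and x > t / 2
    lhs = abs(x ** (1 / n) - t ** (1 / n))
    rhs = (1 / n) * (2 / t) ** ((n - 1) / n) * abs(x - t)
    return lhs <= rhs + 1e-12      # tolerance for floating-point rounding

checks = [root_bound_holds(t * s, t, n)
          for n in (1, 2, 3, 7)
          for t in (0.01, 1.0, 50.0)
          for s in (0.51, 0.9, 1.0, 1.5, 10.0)]
print(all(checks))
```

The constant in front of |x - t| depends on t but not on x, which is all that continuity at the fixed point t requires.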


EXAMPLE 7. Every linear transformation from $R^k$ into $R^m$ is con-
tinuous. A linear transformation is a function

    $T\colon R^k \to R^m$

such that $T(cX_1 + X_2) = cT(X_1) + T(X_2)$. The defining property extends
immediately to the fact that a linear transformation “preserves” linear
combinations

(3.7)    $T(c_1X_1 + \cdots + c_nX_n) = c_1T(X_1) + \cdots + c_nT(X_n)$.

Such transformations are exceedingly special functions. We can describe
them all immediately. Let $E_1, \ldots, E_k$ be the standard basis vectors:

    $E_1 = (1, 0, \ldots, 0)$
    $E_2 = (0, 1, \ldots, 0)$
    $\vdots$
    $E_k = (0, 0, \ldots, 1)$.

Then $X = x_1E_1 + \cdots + x_kE_k$, so, if T is linear we have

    $T(X) = x_1T(E_1) + \cdots + x_kT(E_k)$.

How do we describe all linear transformations from $R^k$ into $R^m$?
Pick any k vectors $Y_1, \ldots, Y_k$ in $R^m$ and define

    $T(X) = x_1Y_1 + \cdots + x_kY_k$.

Then T is linear and $Y_j = T(E_j)$. Furthermore, every linear transformation
has that form.

From the specific form of a linear transformation its continuity is
apparent. If $Y_j = T(E_j)$, then

    $|T(X)| \le |x_1|\,|Y_1| + \cdots + |x_k|\,|Y_k|$.

By Cauchy's inequality

    $|T(X)| \le M|X|$

where

    $M = (|Y_1|^2 + \cdots + |Y_k|^2)^{1/2}$.

Since T is linear,

    $|T(X) - T(X_0)| = |T(X - X_0)| \le M|X - X_0|$.
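The bound $|T(X) - T(X_0)| \le M|X - X_0|$ can be checked on randomly chosen data. The sketch below is an illustration, not a proof: the vectors $Y_1, \ldots, Y_k$ are random, the dimensions k = 4 and m = 3 are arbitrary, and the helper names `apply_T` and `norm` are hypothetical. A small tolerance absorbs floating-point rounding.

```python
import math
import random

def apply_T(Y, X):
    """T(X) = x_1 Y_1 + ... + x_k Y_k, where Y is a list of k vectors in R^m."""
    m = len(Y[0])
    return [sum(x * y[i] for x, y in zip(X, Y)) for i in range(m)]

def norm(V):
    """Euclidean length of the vector V."""
    return math.sqrt(sum(v * v for v in V))

random.seed(0)
k, m = 4, 3
Y = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(k)]
M = math.sqrt(sum(norm(y) ** 2 for y in Y))

ok = True
for _ in range(1000):
    X = [random.uniform(-5, 5) for _ in range(k)]
    X0 = [random.uniform(-5, 5) for _ in range(k)]
    diff = [a - b for a, b in zip(X, X0)]
    lhs = norm([a - b for a, b in zip(apply_T(Y, X), apply_T(Y, X0))])
    ok = ok and lhs <= M * norm(diff) + 1e-9
print(ok)
```

Given $\varepsilon > 0$, the choice $\delta = \varepsilon / M$ (when M > 0) therefore works at every point $X_0$ simultaneously.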


Exercises 


1. The function f, $f(x) = x^2$, is continuous from R into R. Find an explicit
formula in terms of t and $\varepsilon$ for the largest $\delta$ such that

    $|x^2 - t^2| < \varepsilon$ whenever $|x - t| < \delta$.


2. Let g be the greatest integer function, i.e., g(x) is the largest integer $\le x$,
$x \in R$. At which points is g continuous?


3. Let f be the real-valued function on $R^2$,
age = = 2 2 
POG) ait x +y #0 
f(0, 0) = 0.


Show that f is not continuous at the origin. On which lines in $R^2$ is f continuous?


4. Let f be the function of Exercise 3. Describe explicitly all open sets $V \subset R$
for which $f^{-1}(V)$ is open in $R^2$.


5. Let

    $f\colon R^k \to R$

be a continuous function. Suppose that f assumes only rational values. Show
that f is constant.

6. Let f be the real-valued function on $R^k$

    $f(X) = \max_j x_j$,

i.e., f(X) is the largest coordinate of X. Is f continuous?

7. Length is a continuous function on $R^k$.

8. If F is continuous, then |F| is continuous.


9. If f and g are continuous real-valued functions on D then the minimum of
f and g is continuous:


h(X) = min {f(X), g(X)}. 
10. Conjugation is continuous on the complex numbers. 
11. The function

    $f\colon D \to R$

is continuous if and only if the sets $\{X;\ f(X) > t\}$ and $\{X;\ f(X) < t\}$ are open
relative to D (for every $t \in R$).
12. The function /: 


fo =7 


is continuous on the set of complex numbers $z \ne 1$.
(a) Now f(0) = 1. What is the largest ὃ such that 
1 - Κῶ] τῷ, [2] «δὴ 
(b) If $V = \{z \in C;\ \mathrm{Re}(z) > 0\}$, what is $f^{-1}(V)$?


13. If D is an interval on the real line, a convex function on D is a real-valued 
function f such that 


    $f(tx + (1 - t)y) \le t f(x) + (1 - t) f(y)$

for all $x, y \in D$ and all $t \in [0, 1]$.
(a) Interpret the convexity of f in terms of line segments and the graph of f.


(b) Give an example of a discontinuous convex function. 
(c) Prove that every convex function on an open interval is continuous. 


14. Let D be a convex subset of $R^k$, i.e., a subset such that the line segment from
X to Y is in D whenever X and Y are in D.


(a) Define convex function on D. 
(b) Prove that, if D is open, every convex function on D is continuous. 


15. Let T be a function from $R^k$ into $R^m$ which is additive:
$T(X_1 + X_2) = T(X_1) + T(X_2)$. If T is continuous, then T is a linear transformation.


3.2. Continuity and Closed Sets


The relationship between continuous functions and closed sets 
deserves special comment. This little section contains assorted examples 
and remarks which are elementary but which may increase our under- 
standing of continuity. 

Suppose that we have any function

    $F\colon D \to R^m, \quad D \subset R^k$.

If $V_1$ and $V_2$ are subsets of $R^m$, then

    $F^{-1}(V_1 \cup V_2) = F^{-1}(V_1) \cup F^{-1}(V_2)$
    $F^{-1}(V_1 \cap V_2) = F^{-1}(V_1) \cap F^{-1}(V_2)$.


These two properties of inverse images are basic, in spite of the fact that
they are trivial to verify. The second property should not be confused with
$F(U_1 \cap U_2) = F(U_1) \cap F(U_2)$, which is generally false. From the two
properties, it should be apparent that F is continuous if and only if $F^{-1}(K)$
is closed relative to D, for every closed set K in $R^m$. In particular, if F is
continuous and K consists of the single point $Y_0$ in $R^m$, then $F^{-1}(K) =
\{X \in D; F(X) = Y_0\}$ is closed relative to D; that is, each level set of a
continuous function is closed (relative to the domain of the function).
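
The behavior of inverse images versus direct images can be checked on a small finite example. The sketch below is only an illustration: the function $F(x) = x^2$, the finite domain, and the helper names `preimage` and `image` are our own choices, not part of the text.

```python
def preimage(F, domain, V):
    """Return F^{-1}(V) = {x in domain : F(x) in V}."""
    return {x for x in domain if F(x) in V}


def image(F, U):
    """Return F(U) = {F(x) : x in U}."""
    return {F(x) for x in U}


# Illustrative choice (an assumption): F(x) = x^2 on a small finite domain.
D = {-2, -1, 0, 1, 2}
F = lambda x: x * x

V1, V2 = {0, 1}, {1, 4}
# Inverse images respect unions and intersections.
assert preimage(F, D, V1 | V2) == preimage(F, D, V1) | preimage(F, D, V2)
assert preimage(F, D, V1 & V2) == preimage(F, D, V1) & preimage(F, D, V2)

U1, U2 = {-1, 0}, {0, 1}
# Direct images need not respect intersections: F(U1 & U2) is {0},
# while F(U1) & F(U2) is {0, 1}.
assert image(F, U1 & U2) != image(F, U1) & image(F, U2)
```

The failure for direct images comes from F not being 1:1: the points -1 and 1 lie in different sets but have the same image.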

If f is a continuous real-valued function on $R^k$, then $\{X; f(X) \ge 0\}$,
$\{X; -1 \le f(X) \le 3\}$ and $\{X; f(X) = 0\}$ are closed subsets of $R^k$, while
$\{X; f(X) > 0\}$ is an open set. These are things which a student of analysis
must recognize without hesitation.

We have not cited many specific examples of continuous functions. 
We do know that polynomial functions and a few other functions are 
continuous, and they provide us with some interesting examples of closed 
sets. 


EXAMPLE 8. Each linear function on $R^k$ is continuous. Therefore,
every hyperplane (level set of a non-zero linear function) is a closed set.
Each linear subspace of $R^k$ is an intersection of hyperplanes (through the
origin), and so, each subspace is a closed set.

The determinant function is continuous on the space of k × k matrices
(real or complex entries), because it is a polynomial function of the
entries. Thus, the set of singular matrices (those with det A = 0) is a
closed set, and the set of invertible matrices is open. In the real case, the
set of invertible matrices is disconnected by the determinant function,
since it is the union of $\{A; \det A > 0\}$ and $\{A; \det A < 0\}$.

A k × k matrix is called orthogonal if $AA^t = I$, where $A^t$ denotes
the transpose of A (obtained by interchanging the rows and columns of
A). The set of orthogonal matrices is a closed subset of $R^{k \times k}$ or
$C^{k \times k}$, because it is a level set of the continuous function

$$F(A) = AA^t.$$

In the real case, the set of orthogonal matrices is compact, since

$$|A|^2 = k$$

for a real orthogonal k × k matrix, and this bounds the set.

At times, it is useful to know that every non-empty closed set in $R^k$
is a level set of a continuous function. Let E be a non-empty closed set in
$R^k$. For each X, consider the distance from X to E:

$$d(X, E) = \inf \{|X - Y|; Y \in E\}.$$


This can be defined for any non-empty set E, but evidently $d(X, E) =
d(X, \bar{E})$ so we restrict ourselves to closed sets E. We note two things.


(i) $d(X, E) = 0$ if and only if $X \in E$.
(ii) There exists (at least one) $Y \in E$ such that $|X - Y| = d(X, E)$.


Property (i) is essentially the definition of closed set. Property (ii) follows
from this observation. Given X, choose any point Z in E. Evidently, the
points of E which are nearest X are found in the closed ball $B =
\bar{B}(X; |X - Z|)$. Therefore,


$$d(X, E) = \inf \{|X - Y|; Y \in B \cap E\}.$$


The function $f(Y) = |X - Y|$ is continuous on the compact set $B \cap E$.
Thus its infimum is attained at some point Y. (There may be more than
one such Y.)

The function “distance to E” is continuous because 


$$|d(X_1, E) - d(X_2, E)| \le |X_1 - X_2|.$$


Theorem 3. Let E and K be disjoint closed subsets of $R^k$. There exists on
$R^k$ a real-valued continuous function f such that


(i) $0 \le f \le 1$;
(ii) $K = \{X; f(X) = 1\}$;
(iii) $E = \{X; f(X) = 0\}$.

Proof. If K and E are empty, there is nothing to prove. If K is empty
but E is not, take

$$f(X) = \frac{d(X, E)}{1 + d(X, E)}$$

and do a similar sort of thing if E is empty while K is not. If they are both
non-empty, take

$$f(X) = \frac{d(X, E)}{d(X, K) + d(X, E)}.$$


This function has properties (i), (ii), and (iii).
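
To see the construction at work, here is a minimal numerical sketch, assuming that E and K are finite sets of points in the plane (finite sets are closed, and for them the infimum in d(X, E) is a minimum). The sample points and the names `dist` and `separating_function` are our own illustrative choices.

```python
import math


def dist(X, E):
    """d(X, E) = min |X - Y| over Y in E, for a non-empty finite set E."""
    return min(math.dist(X, Y) for Y in E)


def separating_function(E, K):
    """f(X) = d(X, E) / (d(X, K) + d(X, E)) for disjoint non-empty finite E, K."""
    def f(X):
        dE, dK = dist(X, E), dist(X, K)
        return dE / (dK + dE)
    return f


# Hypothetical sample sets in the plane.
E = [(0.0, 0.0), (1.0, 0.0)]
K = [(0.0, 2.0), (3.0, 3.0)]
f = separating_function(E, K)

assert all(f(Y) == 0.0 for Y in E)   # f vanishes exactly on E
assert all(f(Y) == 1.0 for Y in K)   # f equals 1 exactly on K
assert 0.0 < f((1.0, 1.0)) < 1.0     # strictly between elsewhere

# The distance function changes no faster than the point moves.
X1, X2 = (0.5, 2.0), (2.0, -1.0)
assert abs(dist(X1, E) - dist(X2, E)) <= math.dist(X1, X2) + 1e-12
```

The denominator never vanishes because E and K are disjoint and closed, so no point is at distance 0 from both.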


Theorem 3 is frequently used in the following way. We are given a
closed set K and an open set U which contains K. We want to find a continuous
function f such that $0 \le f \le 1$, f = 1 on K and f = 0 outside U.
By taking $E = R^k - U$ in Theorem 3, we see that there is such a function.

The distance function d(X, E) measures the extent to which a point
X can be approximated by points of E. The fact that there exists a $Y \in E$
such that $|X - Y| = d(X, E)$ says that there exists a point of E which is
a best approximation of X by points of E. There is no reason to think that
Y is unique. Many points of E might provide a best approximation. There
is an important class of sets, closed convex sets, for which there is a unique
best approximation. These sets occur in various applications, so let us
say a little bit about them.

The set K is convex if, for each pair of distinct points X, Y in K, the
line segment between X and Y is in K. This means that, if X, Y are in K,
then

$$tX + (1 - t)Y \in K, \qquad 0 \le t \le 1.$$

Closed balls and linear subspaces are examples of closed convex sets.
Half-spaces are the basic closed convex sets. A (closed) half-space in $R^k$
is a set H of the form

$$H = \{X; f(X) \le c\}$$

where c is a real number and f is a non-zero linear functional on $R^k$. The
form of such a functional is

$$f(X) = a_1 x_1 + \cdots + a_k x_k$$

where $a_1, \ldots, a_k$ are real numbers, not all 0. The hyperplane $\{f = c\}$
determines two (closed) half-spaces, $\{f \le c\}$ and $\{f \ge c\} = \{-f \le -c\}$.
So, a half-space is determined by a linear inequation (inequality)

$$a_1 x_1 + \cdots + a_k x_k \le c$$

just as a hyperplane is determined by a linear equation

$$a_1 x_1 + \cdots + a_k x_k = c.$$

In Chapter 1 we discussed the “linear” or flat subsets of $R^k$ (lines,
planes, etc.). These can be described by the following equivalent conditions:

(a) If X, Y are in K, the line through X and Y is in K.

(b) K is an intersection of hyperplanes, i.e., K can be described by a
system of linear equations.

(c) K is a translate of a linear subspace of $R^k$.

We want to see that the following conditions on a closed set K are 
equivalent: 

(a) K is convex. 


(b) K is an intersection of half-spaces; i.e., K can be described by
a (possibly infinite) system of linear inequalities.


Theorem 4. Let K be a closed convex set in $R^k$. If $X \in R^k$, there is a
unique point in K which is nearest to X, i.e., there is one and only one point
Y in K such that

$$|X - Y| = d(X, K).$$


Proof. If $|X - Y| = |X - Z|$ and if $Y \neq Z$, then $\tfrac{1}{2}(Y + Z)$ is closer
to X than either Y or Z. If one wishes to verify that analytically, just do
so in the case X = 0 and use the parallelogram law:

$$|Y - Z|^2 = 2(|Y|^2 + |Z|^2) - |Y + Z|^2$$
$$= 2(|Y|^2 + |Z|^2) - 4\left|\tfrac{1}{2}(Y + Z)\right|^2.$$


(You may feel that use of the triangle inequality is simpler.) 
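
The parallelogram law, and the resulting fact that the midpoint is strictly closer, can be checked numerically. The following is a small sketch in the case X = 0; the two sample points (both at distance 5 from the origin) and the helper names are our own assumptions.

```python
import math


def norm(V):
    """Euclidean length |V|."""
    return math.sqrt(sum(v * v for v in V))


def add(U, V):
    return tuple(u + v for u, v in zip(U, V))


def sub(U, V):
    return tuple(u - v for u, v in zip(U, V))


def scale(c, V):
    return tuple(c * v for v in V)


# Hypothetical sample: two distinct points equidistant from X = 0.
Y, Z = (3.0, 4.0), (5.0, 0.0)
M = scale(0.5, add(Y, Z))  # the midpoint (Y + Z) / 2

lhs = norm(sub(Y, Z)) ** 2
rhs = 2 * (norm(Y) ** 2 + norm(Z) ** 2) - 4 * norm(M) ** 2
assert abs(lhs - rhs) < 1e-9          # parallelogram law
assert norm(M) < norm(Y) == norm(Z)   # the midpoint is strictly closer to 0
```

Since $|Y - Z|^2 > 0$ when $Y \neq Z$, the law forces $|\tfrac{1}{2}(Y + Z)|^2 < \tfrac{1}{2}(|Y|^2 + |Z|^2)$, which is the uniqueness argument in the proof.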


In case K is a linear subspace, the nearest point to X is the orthogonal
projection of X onto K. (See Chapter 1.) In fact, any closed convex set K
defines a “projection.” Define a function

$$P_K : R^k \to R^k$$

by

$$Y = P_K(X) \quad \text{if } Y \in K \text{ and } |X - Y| = d(X, K).$$


Since Y is the unique best approximation to X by elements of K, the function
$P_K$ is well-defined. Several points X, together with their corresponding
best approximations in K, are shown in Figure 11: $Y_j = P_K(X_j)$. We also
have


(a) $P_K$ is continuous;
(b) $P_K(P_K(X)) = P_K(X)$;
(c) $P_K(X) = X$ if and only if $X \in K$.


FIGURE 11
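
For a concrete closed convex set the projection can be written down explicitly. The sketch below assumes K is the box $[0, 1]^k$, for which the nearest point is obtained by clamping each coordinate; the function name `project_box` and the sample points are our own illustrative choices.

```python
import math


def project_box(X, lo=0.0, hi=1.0):
    """Nearest point to X in the box [lo, hi]^k: clamp each coordinate."""
    return tuple(min(max(x, lo), hi) for x in X)


X = (2.0, 0.5)
Y = project_box(X)
assert Y == (1.0, 0.5)
assert project_box(Y) == Y                          # property (b)
assert project_box((0.25, 0.75)) == (0.25, 0.75)    # property (c): points of K are fixed

# Brute-force sanity check against a grid of points of K:
# no grid point is closer to X than Y is.
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
assert all(math.dist(X, Y) <= math.dist(X, G) + 1e-12 for G in grid)
```

The grid check is only a sanity test, not a proof; it samples K rather than examining every point.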


Take any X in $R^k$ which is not in K. Let $Y = P_K(X)$. Then K is entirely
on one side of the hyperplane through Y orthogonal to X - Y. (We have
left the proof as an exercise.) That hyperplane is


$$Y + (X - Y)^{\perp} = \{Z; \langle Z, X - Y \rangle = \langle Y, X - Y \rangle\}.$$

In other words, the hyperplane is

$$\{Z; f(Z) = f(Y)\}$$

where f is the linear functional “inner product with X - Y”:

$$f(Z) = \langle Z, X - Y \rangle.$$

Either K is in the half-space where $f \le f(Y)$ or the half-space where
$f \ge f(Y)$, according as $f(X) > f(Y)$ or $f(X) < f(Y)$.


Theorem 5. Every closed convex set is an intersection of half-spaces. 


The only property of the projection $P_K$ which we asserted but which
is not obvious is the continuity. We end this section by indicating how this
can be verified. If $Y = P_K(X)$ we want to choose $\delta > 0$ so that $|Y - P_K(Z)|$
is small for all vectors Z such that $|X - Z| < \delta$. Suppose $\delta > 0$ is small
and Z satisfies $|X - Z| < \delta$. Let $W = P_K(Z)$. We obtain the following
inequalities for the reasons indicated at the right.


$$|X - Y| \le |X - W|, \qquad Y = P_K(X)$$

$$|X - W| \le |Z - W| + \delta, \qquad |X - Z| < \delta$$
$$\le |Z - Y| + \delta, \qquad W = P_K(Z)$$
$$\le |X - Y| + 2\delta, \qquad |X - Z| < \delta.$$


Therefore,

$$|X - Y| \le |X - W| \le |X - Y| + 2\delta.$$

If $\delta$ is small, does this imply that W is close to Y? No, but it does if we
add the one other piece of information we have, which is that (since K is
convex) every point on the line segment from Y to W is at least as far
from X as Y is. Since there are but three vectors involved, this need only
be verified in the plane where it says: If X and Y are points in the plane,
if W is a point such that the line segment from Y to W lies between the
circles about X of radius $|X - Y|$ and $|X - Y| + 2\delta$, and if $\delta$ is small, then W
is close to Y.


Exercises 


1. True or false? If f is a continuous real-valued function on R, and if K is a
compact subset of R, then $f^{-1}(K)$ is compact.


2. True or false? If F is continuous from $R^k$ to $R^m$ and S is a subset of $R^m$,
then $F^{-1}(\bar{S})$ is the closure of $F^{-1}(S)$.


3. If f and g are continuous complex-valued functions on $R^k$, then $\{X; f(X)
= g(X)\}$ is closed.


4. Prove that the function

$$F : R^k \to R^m$$

is continuous if and only if, for every set $S \subset R^m$, the boundary of $F^{-1}(S)$ is
contained in $F^{-1}(\text{boundary } S)$.


5. Let D be a connected subset of $R^k$ and suppose that there is a continuous
real-valued function on D which is nowhere 0. Show that the function is either
everywhere positive or everywhere negative.

6. Let D be a disconnected subset of $R^k$. Prove that there exists on $R^k$ a
continuous real-valued function which is nowhere zero on D, yet takes on both
positive and negative values at points of D. Hint: Use distance functions and
Theorem 14 of Chapter 2.


7. Let f be a real function on $R^k$ such that $f^{-1}(V)$ is closed, for every open set
V in R. What can you say about f? Interchange open and closed.

8. Prove the converse of Theorem 4: If K is a subset of $R^k$ such that every point
of $R^k$ has a unique best approximation by points of K, then K is closed and convex.


9. Let D be an open subset of $R^k$ and let

$$F : D \to R^m$$

be a continuous function on D. Let K be a closed subset of D. Prove that there
exists a continuous function

$$G : R^k \to R^m$$

such that G(X) = F(X) for all $X \in K$.


10. If E is a closed convex set and K is a compact convex set, then E and K can
be strictly separated by a hyperplane, i.e., there is a non-zero linear f and a number
c such that f > c on E and f < c on K. (Hint: What kind of a set is the
algebraic sum E + (-K)?)

11. An extreme point of a convex set K is a point $Y \in K$ which does not lie on
the line segment joining any two different points in K, i.e., if $X_1, X_2$ are in K and

$$Y = \tfrac{1}{2}(X_1 + X_2)$$

then $X_1 = X_2 = Y$.
(a) A closed convex set need have no extreme points.
(b) A non-empty compact convex set has an extreme point.
*(c) A compact convex set K consists of all convex combinations


$$t_1 Y_1 + \cdots + t_n Y_n, \qquad t_j \ge 0, \quad \sum_j t_j = 1$$


where $Y_1, \ldots, Y_n$ are extreme points of K.


*12. If S is any set in $R^k$, the smallest convex set which contains S consists of all
convex combinations


$$\sum_{j=1}^{n} t_j Y_j, \qquad Y_j \in S, \quad t_j \ge 0, \quad \sum_j t_j = 1$$


with $n \le k + 1$ (all convex combinations of k + 1 or less points of S).




3.3. The Limit of a Function 


Continuity can be formulated in terms of the concept of the limit of 
a function at a point. This concept is similar to the idea of the limit of a 
sequence, as we shall discuss later. 
Definition. Let F be a function

$$F : D \to R^m, \qquad D \subset R^k.$$

Let $X_0$ be a cluster point of the set D. We say that F has the limit L at
$X_0$ provided

(i) $L \in R^m$;

(ii) if V is any neighborhood of L, there exists a neighborhood U of
the point $X_0$ such that

$$F(X) \in V \quad \text{for every } X \in D \cap U, \ X \neq X_0.$$

If (i), (ii) hold, we write

$$L = \lim_{X \to X_0} F(X).$$

We say that F has a limit at $X_0$ or that $\lim_{X \to X_0} F(X)$ exists provided there
exists $L \in R^m$ such that

$$L = \lim_{X \to X_0} F(X).$$

(If the limit exists, it is unique. This follows from the fact that $X_0$ is a cluster
point of D.)


Lemma. Let $X_0$ be a cluster point of the domain D, and let L be a point
in $R^m$. The following are equivalent.

(i) $\lim_{X \to X_0} F(X) = L$;

(ii) For each $\epsilon > 0$, there exists a number $\delta > 0$ such that $|F(X) - L|
< \epsilon$ for every X in D such that $0 < |X - X_0| < \delta$.

(iii) If $\{X_n\}$ is a sequence of points in D such that $X_n \neq X_0$ and $\{X_n\}$
converges to $X_0$, then $\{F(X_n)\}$ converges to L.

(iv) If $D_0 = D \cup \{X_0\}$ and if $F_0$ is the function on $D_0$ defined by

$$F_0(X) = \begin{cases} F(X), & \text{if } X \in D \text{ and } X \neq X_0 \\ L, & \text{if } X = X_0 \end{cases}$$

then $F_0$ is continuous at the point $X_0$.


Proof. The point $X_0$ is in the closure of D. It may or may not be in
D. The definition of “the limit of F at $X_0$” makes no mention of whether
or not $X_0 \in D$, and if $X_0 \in D$, the definition is completely independent
of $F(X_0)$. This must be understood very clearly. Then, it should be understood
clearly that (iv) is a reformulation of the definition. It is then
apparent that (ii) and (iii) are reformulations of (i).


Theorem 6. If $X_0$ is a cluster point of D, then F is continuous at $X_0$ if
and only if

$$\lim_{X \to X_0} F(X) = F(X_0).$$
Lemma. Let $X_0$ be a cluster point of D. These are equivalent.


(i) F has a limit at $X_0$.
(ii) If $\epsilon > 0$, there exists $\delta > 0$ such that $|F(X_1) - F(X_2)| < \epsilon$ for
all points $X_1, X_2$ in D such that


$$0 < |X_1 - X_0| < \delta$$
$$0 < |X_2 - X_0| < \delta.$$


(iii) If $X_0 = \lim X_n$, where $X_n \in D$ and $X_n \neq X_0$, then $\{F(X_n)\}$ is a
Cauchy sequence.


Proof. We content ourselves with verifying that (i) follows from (iii).
In other words, if (iii) holds, we must show that the limit of the sequence
$\{F(X_n)\}$ is independent of the sequence $\{X_n\}$. So suppose we have two such
sequences

$$X_0 = \lim X_n, \qquad X_n \in D, \quad X_n \neq X_0$$
$$X_0 = \lim Y_n, \qquad Y_n \in D, \quad Y_n \neq X_0.$$

Let

$$L = \lim F(X_n)$$
$$M = \lim F(Y_n).$$

The sequence $X_1, Y_1, X_2, Y_2, X_3, Y_3, \ldots$ converges to $X_0$. By (iii), the
sequence $F(X_1), F(Y_1), F(X_2), F(Y_2), \ldots$ is a Cauchy sequence. That is
not likely to happen if $L \neq M$.


Evidently, we now have several elementary facts which we can state.
The limit of a sum is the sum of the limits, etc. Such results follow from
the corresponding results for sequences, or the corresponding results for
continuous functions. But, a better way to think of it is that the formal
reasoning is the same in the sequence case as it is in the case of the limit
of a function at a point. Some people find it helpful to regard as analogous
the facts that (1) in defining $\lim_{X \to X_0} F(X)$ we do not discuss $F(X_0)$; (2) a
sequence $\{X_n\}$ has no last term.

One word of warning is in order. The domain of definition of a func- 
tion is part of the function. The discussion of limits is one of the situations 
in which it is extremely important to bear this in mind. The same sort of 
rule may define functions on various sets, and a change in the domain of 
definition may affect the question of the existence of the limit at some 
point. Of course, we follow the usual convention that if a function is to 
be defined by a rule and if the domain is not explicitly mentioned, then 
the domain is the largest subset of $R^k$ on which the rule makes sense.


EXAMPLE 9. The function f defined by

$$f(x) = \frac{x}{|x|}, \qquad x \neq 0$$

does not have a limit at 0. That is, we cannot extend f to a continuous
function on R. The function g defined by

$$g(x) = \frac{x}{|x|}, \qquad x > 0$$

does have a limit at 0.
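
The role of the domain can be made concrete by evaluating the same rule x/|x| along sample points approaching 0 from two different domains. This is only an illustrative sketch; the particular sequences and the name `sign_ratio` are our own choices, and a finite sample suggests but does not prove the limit behavior.

```python
def sign_ratio(x):
    """The rule x / |x|, defined for x != 0."""
    return x / abs(x)


# Sample points approaching 0 (illustrative sequences, an assumption).
two_sided = [(-1) ** n / 2 ** n for n in range(1, 9)]   # points of the domain x != 0
right_only = [1 / 2 ** n for n in range(1, 9)]          # points of the domain x > 0

f_values = [sign_ratio(x) for x in two_sided]
g_values = [sign_ratio(x) for x in right_only]

assert set(f_values) == {-1.0, 1.0}   # values keep alternating: no limit at 0
assert set(g_values) == {1.0}         # constant along the right-hand domain
```

The rule is identical in both cases; only the set of points allowed to approach 0 differs.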


EXAMPLE 10. Define

$$f(x) = \sin\left(\frac{1}{x}\right), \qquad x \neq 0.$$

Then f does not have a limit at 0, because f oscillates near 0 between the
values 1 and -1. (See Figure 12.) For instance, the sequence

$$x_n = \frac{2}{n\pi}$$

converges to 0, but the values $f(x_n)$ are 1, 0, -1, 0, ....


FIGURE 12


EXAMPLE 11. Let

$$g(x) = x \sin\left(\frac{1}{x}\right), \qquad x \neq 0.$$

Then

$$\lim_{x \to 0} g(x) = 0$$

because $|g(x)| \le |x|$. (See Figure 13.) Therefore we can extend g to a
continuous function on R, by assigning the value 0 to the point 0.
FIGURE 13
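
The contrast between sin(1/x) and x sin(1/x) near 0 can be tabulated in floating point. The sketch below assumes the sample sequence $x_n = 2/(n\pi)$; the computed values agree with 1, 0, -1, 0, ... only up to rounding error, so the comparison uses a tolerance.

```python
import math


def f(x):
    """sin(1/x) for x != 0."""
    return math.sin(1 / x)


def g(x):
    """x * sin(1/x) for x != 0."""
    return x * math.sin(1 / x)


xs = [2 / (n * math.pi) for n in range(1, 9)]   # a sequence converging to 0
expected = [1, 0, -1, 0, 1, 0, -1, 0]

# f(x_n) cycles through 1, 0, -1, 0 (up to rounding), so f has no limit at 0.
assert all(abs(f(x) - e) < 1e-9 for x, e in zip(xs, expected))

# g is squeezed: |g(x)| <= |x|, which forces g(x) -> 0 as x -> 0.
assert all(abs(g(x)) <= abs(x) for x in xs)
```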
EXAMPLE 12. Let

$$f(x, y) = \frac{xy}{x^2 + y^2}, \qquad (x, y) \neq (0, 0).$$

Then f does not have a limit at the origin. If $x = r \cos \theta$ and $y = r \sin \theta$,
then

$$f(x, y) = \tfrac{1}{2} \sin 2\theta.$$

Thus, the restriction of f to a line through the origin has a limit at (0, 0).
The limits along those various lines tend to be different.
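
A quick numerical look at the behavior along rays through the origin follows. The printed formula for f is damaged in our copy, so this sketch assumes $f(x, y) = xy/(x^2 + y^2)$, which is the function whose polar form is $\tfrac{1}{2} \sin 2\theta$; the helper name `along_ray` is our own.

```python
import math


def f(x, y):
    """Assumed form of the example: xy / (x^2 + y^2), (x, y) != (0, 0)."""
    return x * y / (x * x + y * y)


def along_ray(theta, r):
    """Value of f at the point with polar coordinates (r, theta), r > 0."""
    return f(r * math.cos(theta), r * math.sin(theta))


# Along each ray the value does not depend on r ...
for theta in (0.0, math.pi / 4, 3 * math.pi / 4):
    assert abs(along_ray(theta, 1e-3) - along_ray(theta, 1e-6)) < 1e-9
    assert abs(along_ray(theta, 1e-6) - 0.5 * math.sin(2 * theta)) < 1e-9

# ... but different rays give different values, so there is no limit at (0, 0).
assert abs(along_ray(math.pi / 4, 1e-6) - along_ray(3 * math.pi / 4, 1e-6)) > 0.9
```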


EXAMPLE 13. Let F be the inversion function on the set of invertible
k × k matrices:

$$F(A) = A^{-1}, \qquad A \text{ invertible}.$$

Let B be a k × k matrix which is not invertible. Then B is a cluster point
of the domain of F, because the set of invertible matrices is dense in the
space of k × k matrices. Does F have a limit at B? No. In fact, there does
not exist any sequence of invertible matrices $A_n$ with $B = \lim A_n$ for which
the sequence $\{A_n^{-1}\}$ converges. For, if we had

$$\lim A_n = B$$
$$\lim A_n^{-1} = C$$

then it would follow that

$$I = \lim A_n A_n^{-1} = BC$$

and hence that B is invertible.
Exercises


1. In the space of k × k matrices,

$$\lim_{A \to 0} e^A = I.$$


2. Let A be a fixed k × k matrix. Then

$$\lim_{t \to 0} \frac{e^{tA} - I}{t} = A.$$


3. Let

$$f(x) = \exp\left(\frac{1}{x}\right), \qquad x \neq 0.$$

Then f does not have a limit at 0.
4. Let

$$f(x) = \exp\left(-\frac{1}{x^2}\right), \qquad x \neq 0.$$

If k is any positive integer

$$\lim_{x \to 0} x^{-k} f(x) = 0.$$


5. Let $D = \{z \in C; |z| < 1\}$ be the open unit disk in the plane. Let


f™ =a — 2) exp (2 ++), ze D. 


Let’s prove that 
lim f(z) = 0. 
z>1 


(i) If w is a complex number, then


$$|e^w| = e^{\mathrm{Re}(w)}.$$
Gi) = -- ΝᾺ ere a er 
(iii) Τ᾿... Σ), ἢ, 
So 
lim f(z) = 0. 
6. If 


f(z) = α — ὃ exp (2 tt), ze, 


then f does not have a limit at z = 1. 
7. Define a function f on the set of real numbers, as follows. If x is a
nonzero rational number, write x = p/q, where $p \in Z$, $q \in Z_+$, and p, q are
relatively prime (no non-trivial common divisor); then let f(x) = 1/q. If x is
irrational or zero, define f(x) = 0. Prove that

$$\lim_{t \to x} f(t) = 0$$

for each real number x. At which points is f continuous?
8. For functions on the real line, one can discuss left- and right-hand limits.
Suppose

$$F : D \to R^m, \qquad D \subset R$$

and let x be a point which is a cluster point of D from both sides. Define this
type of cluster point, and then define (if they exist)

$$\lim_{t \to x^-} F(t) \quad \text{and} \quad \lim_{t \to x^+} F(t).$$


9. Let f be a real-valued function on the real line which is increasing:

$$f(x) \le f(t), \qquad x < t.$$

At every point x the left- and right-hand limits of f exist. (See Exercise 8.) Such
an f is continuous, except at a countable number of points. (Use Exercise 12 of
Section 2.5.)
10. Let f be a real-valued function on D. Let $X_0$ be a cluster point of D. How
would you define

$$\liminf_{X \to X_0} f(X) \quad \text{and} \quad \limsup_{X \to X_0} f(X)?$$

Every bounded function should have a lim sup and a lim inf at $X_0$; and, the
function should have a limit at $X_0$ exactly when the lim sup equals the lim inf.
In order to work effectively in analysis, it is important to know how
to handle such statements as

(3.8) $\lim_{x \to \infty} x e^{-x} = 0$

(3.9) $\lim_{X \to 0} \log |X| = -\infty$.


Such things do not present much more difficulty than the limits we have 
discussed thus far. As with most things that smack of “the infinite,” the 
secret of handling them rests not so much in knowing what to say as in 
knowing what not to say. 

The most concise way of dealing with limits such as (3.8) and (3.9) is
this. We introduce the extended real number system, which consists of the
set of real numbers together with two additional objects, $\infty$ and $-\infty$. We
do not say what those objects are, except that we expressly assume that
$\infty \notin R$ and $-\infty \notin R$. The ordering of the real numbers is extended to
the enlarged system by

$$-\infty < \infty$$
$$-\infty < x < \infty, \qquad x \in R.$$

We can extend addition and multiplication only partially to the enlarged
system

$$x + \infty = \infty \quad \text{and} \quad x + (-\infty) = -\infty, \qquad x \in R$$
$$x \cdot \infty = \infty, \qquad x > 0$$
$$x \cdot \infty = -\infty, \qquad x < 0$$
$$x(-\infty) = -(x \cdot \infty), \qquad x \neq 0$$
$$\infty^2 = \infty$$
$$\infty(-\infty) = -\infty$$
$$(-\infty)^2 = \infty.$$

In short, a + b is defined unless a, b are $\infty$ and $-\infty$ in one or the other
order; and ab is defined unless one of a, b is 0 and the other is $\pm\infty$. The usual
rules of algebra hold, in the sense that “any algebraic calculation we might make
will be correct, as long as we avoid $\infty + (-\infty)$ and $0 \cdot (\pm\infty)$”. But, that sort of
algebraic formalism is of relatively little use to us.
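
Floating-point arithmetic happens to follow the same conventions, which gives a convenient way to experiment. The sketch below uses Python's `math.inf`; it is an analogy only, since floating-point numbers are not the extended real number system of the text, and the excluded combinations show up as NaN ("not a number") rather than being left undefined.

```python
import math

inf = math.inf

# The defined sums and products agree with the rules listed above.
assert 5.0 + inf == inf
assert 5.0 + (-inf) == -inf
assert 2.0 * inf == inf
assert -2.0 * inf == -inf
assert inf * inf == inf
assert inf * (-inf) == -inf
assert (-inf) * (-inf) == inf

# The ordering extends as well.
assert -inf < 0.0 < inf

# The two excluded combinations produce NaN in floating-point arithmetic.
undefined = [inf + (-inf), 0.0 * inf]
assert all(math.isnan(v) for v in undefined)
```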

In the extended real number system, a neighborhood of $\infty$ is a subset
which contains $\infty$ and also contains $\{x \in R; x > t\}$ for some $t \in R$.
Replace $\infty$ by $-\infty$ and x > t by x < t, and we have the definition of a
neighborhood of $-\infty$.

Now we define limits as before, using neighborhoods; and statements
such as (3.8) and (3.9) have a well-defined meaning. Suppose, for instance,
that f is a real-valued function on a subset of $R^k$:

$$f : D \to R, \qquad D \subset R^k.$$

What does it mean to say that

(3.10) $\lim_{X \to X_0} f(X) = \infty$?

It means this: $X_0$ is a cluster point of D and, for every neighborhood V of
$\infty$ there exists a neighborhood U of $X_0$ such that $f(U - \{X_0\}) \subset V$. In other
words, it means that for each real number t, there exists $\delta > 0$ such that

$$f(X) > t, \qquad X \in D, \quad 0 < |X - X_0| < \delta.$$


For another illustration, suppose that F is a vector-valued function
defined on a subset of R:

$$F : D \to R^m, \qquad D \subset R.$$

What does it mean to say that

(3.11) $\lim_{x \to \infty} F(x) = L$?

It means that “$\infty$ is a cluster point of D” and that, for every $\epsilon > 0$, there
exists $t \in R$ such that

$$|F(x) - L| < \epsilon, \qquad x \in D, \quad x > t.$$

The terminology “$\infty$ is a cluster point of D” is not to be taken too seriously.
It means that D is not bounded above. Notice that, in case D is the
set of positive integers, F is a sequence and the limit reverts to the limit
of a sequence, as previously defined.

From time to time, we shall use “infinite limits” as a piece of shorthand.
Keep this in mind: If f is a real-valued function and if we say that f
has a limit at $X_0$, that means that the limit exists in R, as in Section 3.3.
If we wish to say that

$$\lim_{X \to X_0} f(X) = \infty$$

we shall say precisely that.

While we are on the subject of the extended real number system, we
may as well formally extend the definitions of sup S and inf S to arbitrary
sets S on the real line. In the extended real number system, every set S has
a least upper bound and a greatest lower bound and they are denoted by
sup S and inf S, respectively. Thus, for example,

$$\sup S = \infty$$

means that S is non-empty but not bounded above by any real number.

Some people prefer to apply sup and inf only to non-empty sets, because
$\sup \emptyset = -\infty$ and $\inf \emptyset = \infty$.


Exercises 


1. Let f, g, and h be real-valued functions on a set D in $R^k$. Let $X_0$ be a cluster
point of D. Suppose that $f \le g \le h$ and that f, h have limits at $X_0$ which are
equal. Show that g has the same limit at $X_0$.


2. Define what is meant by 


$$\lim f(x) = -\infty$$


without mentioning $\infty$ or $-\infty$.
3. If p is a polynomial with complex coefficients,

$$\lim_{x \to \infty} p(x) e^{-x} = 0.$$

4. If f is an increasing function on the real line:

$$f(x) \le f(t), \qquad x \le t$$

and if f is bounded, then

$$\lim_{x \to \infty} f(x)$$

exists.
5. Find: 
(a) lim exp (1/x); 


(b) lim exp (1/x). 
6. Let f be a real-valued continuous function on the real line. Suppose that, for
every $\epsilon > 0$, the set

$$\{x; |f(x)| \ge \epsilon\}$$

is compact. Then

$$\lim_{|x| \to \infty} f(x) = 0.$$


7. We defined neighborhoods of $\infty$ and $-\infty$ in the extended real number
system. With those neighborhoods, define open sets in the extended system,
just as we defined them in $R^k$. Then prove that the extended real number system
is compact.


3.5. Continuous Mappings 


This section contains two theorems which are (now) quite easy to
prove, but which have widespread application. The theorems state that a
continuous mapping preserves compactness and connectedness. A continuous
function F,

$$F : D \to R^m, \qquad D \subset R^k$$

has the property that $F^{-1}(V)$ is open relative to D, for every open set V in
$R^m$. As we have seen, this is the same as saying that $F^{-1}(S)$ is closed relative
to D, for every closed set S in $R^m$. In general, the continuous function F
does not preserve open sets or closed sets, as does $F^{-1}$. But, it is true that
F(K) is compact if K is compact and is connected if K is connected.


Theorem 7. Let F be a continuous function. If K is a compact subset of 
the domain of F, then F(K) is compact. 


Proof. We may assume that K = D, because the restriction of F to
K is a continuous function from K into $R^m$. Let $\{V_\alpha; \alpha \in A\}$ be an open
cover of F(D). Let

$$U_\alpha = F^{-1}(V_\alpha).$$

Then $\{U_\alpha; \alpha \in A\}$ is a cover of D. Since F is continuous, each $U_\alpha$ is open
relative to D. Since D is compact, there exist indices $\alpha_1, \ldots, \alpha_n$ in A such
that

$$D = U_{\alpha_1} \cup \cdots \cup U_{\alpha_n}.$$

Then

$$F(D) \subset V_{\alpha_1} \cup \cdots \cup V_{\alpha_n}.$$

We have extracted from each open cover of F(D) a finite subcover: F(D)
is compact.


Corollary. If

$$F : D \to R^m$$

is continuous and D is compact, then F is bounded and there exists a point
$X_0$ in D such that

$$|F(X_0)| = \sup \{|F(X)|; X \in D\}.$$

Proof. Since F is continuous, |F| is continuous. Thus, the range of
|F| is compact and contains its supremum.


Corollary. If f is a continuous real-valued function on a compact set D,
there exist points $X_1, X_2$ in D such that

$$f(X_1) = \sup \{f(X); X \in D\}$$
$$f(X_2) = \inf \{f(X); X \in D\}.$$


Notice that, if the domain of F is compact, then F does carry closed 
sets onto closed sets: Let K be a closed subset of D; then K is compact, 
because D is compact; so F(K) is compact and therefore closed. It does 
not follow that F carries open subsets of D onto sets which are open rela- 
tive to F(D). Remember that F does not usually preserve intersections, 
and so we would not expect preservation of closed sets to imply preserva- 
tion of open sets. 

Theorem 7 can be proved another way. Suppose D is closed and
bounded and F is continuous. We can show that F(D) is closed and
bounded in this way. Let $\{Y_n\}$ be any sequence of points in F(D). For each
n, choose some point $X_n$ in D such that $Y_n = F(X_n)$. Since D is bounded,
the Bolzano-Weierstrass theorem tells us that there is a subsequence $\{X_{n_k}\}$
which converges to some point X. Since D is closed, X is in D. Since F is
continuous, the sequence $\{Y_{n_k}\}$ converges to F(X). We have proved that
every sequence in F(D) has a subsequence which converges to a point in
F(D). Thus, F(D) is bounded and closed.


Theorem 8. Let F be a continuous function. If K is a connected subset 
of the domain D, then F(K) is connected. 


Proof. We may assume that K = D. Let S be a subset of F(D) which
is both open and closed relative to F(D). Since F is continuous, $F^{-1}(S)$ is
both open and closed relative to D. Since D is connected, either $F^{-1}(S)$ is
empty or $F^{-1}(S) = D$. Therefore, either S is empty or S = F(D).


Corollary. Let D be an interval on the real line, and let f be a continuous
real-valued function on D. Either f is constant or f(D) is an interval.


Proof. The connected subsets of R are the empty set, sets with one 
member, and intervals. 


Corollary. Let D = [a, b] be a closed (bounded) interval on the real
line, and let f be a continuous non-constant real-valued function on D. Then
f(D) is a closed (bounded) interval.

Proof. The set f(D) is compact, connected, and contains more than
one point. Of course, f(D) = [m, M], where

$$m = \inf \{f(x); a \le x \le b\}$$
$$M = \sup \{f(x); a \le x \le b\}.$$


The last corollary combines two theorems usually found in a calculus
course. (1) A continuous real-valued function on a closed interval attains
its maximum and minimum. (2) A continuous real-valued function on an
interval has the intermediate value property: If $f(x_1) = y_1$ and $f(x_2) = y_2$,
then, between $x_1$ and $x_2$, f takes on every value between $y_1$ and $y_2$.

The intermediate value property tells us this about a continuous
real-valued function f on an interval. The function f cannot be 1:1 unless
f is strictly increasing

$$f(x) < f(y), \qquad x < y$$

or strictly decreasing

$$f(x) > f(y), \qquad x < y.$$

If we have, for instance, a strictly increasing continuous f on the interval
[a, b], we can define the inverse function $f^{-1}$:

$$f^{-1} : [f(a), f(b)] \to [a, b].$$

It is automatic that $f^{-1}$ is continuous in this particular case.


Theorem 9. If f is a continuous and strictly increasing function on an
interval $I \subset R$, then f(I) is an interval and

$$f^{-1} : f(I) \to I$$

is continuous.

Proof. Since f maps each closed interval [a, b] in I onto the closed
interval [f(a), f(b)], the image of the open interval (a, b) must be [f(a), f(b)]
with f(a), f(b) deleted:

$$f((a, b)) = (f(a), f(b)).$$

Thus f maps open sets onto open sets, i.e., $f^{-1}$ is continuous.


For the case of a closed interval, Theorem 9 is a special case of the 
following very basic theorem. 


Theorem 10. Let F be a continuous function on a compact set D. If F
is 1:1, then the inverse function

$$F^{-1} : F(D) \to D$$

is continuous.

Proof. Since D is compact, every closed subset of D is compact.
Therefore, F carries closed sets onto closed sets. Since F is 1:1 the function
$F^{-1}$ is defined, and if K is a closed subset of D,

$$(F^{-1})^{-1}(K) = F(K)$$

is closed. Thus $F^{-1}$ is continuous.


EXAMPLE 14. Here is an illustration of Theorem 9. We could have
proved the existence of nth roots this way. Let

$$f(x) = x^n, \qquad x \ge 0.$$

Since the domain $D = [0, \infty)$ is an interval, and since f is continuous,
f(D) is an interval. Observe that $f(x) \ge 0$, f(0) = 0, and f is not bounded.
Conclusion: $f(D) = [0, \infty)$. So, every $t \ge 0$ is the nth power of some $x \ge 0$.
On D, f is 1:1, so the inverse function

$$f^{-1}(t) = t^{1/n}$$

is continuous.
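
The existence argument also suggests a way to compute the root: because $x^n$ is continuous and strictly increasing on $[0, \infty)$, bisection homes in on the unique x with $x^n = t$. This is a minimal sketch under our own choices (the name `nth_root`, the initial bracket, and the fixed number of halving steps); it is not a method taken from the text.

```python
def nth_root(t, n, steps=200):
    """Approximate the x >= 0 with x**n == t, for t >= 0 and an integer n >= 1."""
    # Bracket the root: 0**n = 0 <= t, and hi**n >= hi >= t because hi >= 1.
    lo, hi = 0.0, max(1.0, t)
    for _ in range(steps):          # each step halves the bracket
        mid = (lo + hi) / 2
        if mid ** n < t:
            lo = mid                # the root lies to the right of mid
        else:
            hi = mid                # the root lies at or to the left of mid
    return (lo + hi) / 2


assert abs(nth_root(2.0, 2) - 2 ** 0.5) < 1e-9
assert abs(nth_root(27.0, 3) - 3.0) < 1e-9
```

Monotonicity is what makes the comparison `mid ** n < t` decide on which side the root lies; the intermediate value property is what guarantees a root exists inside the bracket.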


For the next few examples, we need to know that the exponential 
function is continuous. We shall prove this now, as a lemma. We suggest 
that (perhaps) the reader should look at the examples and then return to 
the proof of the lemma. 


Lemma. The exponential function on k × k matrices is continuous.


Proof. By definition

$$e^A = \sum_{k=0}^{\infty} \frac{1}{k!} A^k = \lim_n p_n(A)$$

where

$$p_n(A) = \sum_{k=0}^{n} \frac{1}{k!} A^k.$$

Since $p_n$ is a polynomial function, it is continuous. The $p_n$'s approximate
the exponential function, and so it seems plausible that exponentiation is
continuous; however, such reasoning is not always valid. We need to be
careful about how well $p_n(A)$ approximates $e^A$. We know that

$$|e^A - p_n(A)| \le \sum_{k=n+1}^{\infty} \frac{1}{k!} |A|^k.$$

Look at

$$e^A - e^B = e^A - p_n(A) + p_n(A) - p_n(B) + p_n(B) - e^B.$$

We have

$$|e^A - e^B| \le |e^A - p_n(A)| + |p_n(A) - p_n(B)| + |p_n(B) - e^B|$$
$$\le \sum_{k=n+1}^{\infty} \frac{1}{k!} |A|^k + |p_n(A) - p_n(B)| + \sum_{k=n+1}^{\infty} \frac{1}{k!} |B|^k$$
$$= |p_n(A) - p_n(B)| + \sum_{k=n+1}^{\infty} \frac{1}{k!} \left(|A|^k + |B|^k\right).$$

This is valid for every n.

Fix A. Suppose we want to guarantee that $|e^A - e^B| < \epsilon$ if B is sufficiently
close to A. If we at least keep |A - B| < 1, then |B| < 1 + |A| and

$$|A|^k + |B|^k < 2(1 + |A|)^k.$$

Thus,

$$|e^A - e^B| \le |p_n(A) - p_n(B)| + R_n$$

where

$$R_n \le 2 \sum_{k=n+1}^{\infty} \frac{1}{k!} (1 + |A|)^k.$$

Now 1 + |A| is some fixed positive number and so

$$\sum_{k=1}^{\infty} \frac{1}{k!} (1 + |A|)^k < \infty.$$

If $\epsilon > 0$, we can choose N so that

(3.12) $\sum_{k=N+1}^{\infty} \frac{1}{k!} (1 + |A|)^k < \frac{\epsilon}{4}$;

and, for that particular N

$$|e^A - e^B| \le |p_N(A) - p_N(B)| + \frac{\epsilon}{2}, \qquad |A - B| < 1.$$

Retain the fixed A and the fixed N (which depends upon A). Since
$p_N$ is continuous, there exists $\delta > 0$ such that

$$|p_N(A) - p_N(B)| < \frac{\epsilon}{2}, \qquad |A - B| < \delta.$$

We may assume $\delta < 1$, and then we have

(3.13) $|e^A - e^B| < \epsilon, \qquad |A - B| < \delta.$
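
The tail estimate at the heart of the proof can be tested on a matrix whose exponential is known in closed form. The sketch below assumes |A| denotes the Euclidean norm of the entries, and uses the 2 × 2 matrix with rows (0, -1) and (1, 0), whose exponential is the rotation through 1 radian. All helper names are hypothetical, and the infinite tail is truncated after 100 terms.

```python
import math


def identity(k):
    return [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]


def mat_mul(A, B):
    k = len(A)
    return [[sum(A[i][r] * B[r][j] for r in range(k)) for j in range(k)]
            for i in range(k)]


def norm(A):
    """Euclidean norm of the entries (assumed meaning of |A|)."""
    return math.sqrt(sum(x * x for row in A for x in row))


def exp_partial(A, n):
    """p_n(A) = sum of A^j / j! for j = 0, ..., n."""
    k = len(A)
    total, term = identity(k), identity(k)   # term holds A^j / j!
    for j in range(1, n + 1):
        term = [[x / j for x in row] for row in mat_mul(term, A)]
        total = [[total[r][c] + term[r][c] for c in range(k)] for r in range(k)]
    return total


def tail(a, n, terms=100):
    """Truncated value of the tail: sum of a^j / j! for j = n+1, ..., n+terms."""
    term = a ** (n + 1) / math.factorial(n + 1)
    s = 0.0
    for j in range(n + 1, n + 1 + terms):
        s += term
        term *= a / (j + 1)
    return s


A = [[0.0, -1.0], [1.0, 0.0]]
exact = [[math.cos(1.0), -math.sin(1.0)], [math.sin(1.0), math.cos(1.0)]]
n = 5
P = exp_partial(A, n)
error = norm([[exact[r][c] - P[r][c] for c in range(2)] for r in range(2)])
bound = tail(norm(A), n)
assert error <= bound   # |e^A - p_n(A)| is controlled by the tail of the series
```

The bound depends only on |A|, not on A itself; that uniformity over the ball |B| < 1 + |A| is what the proof exploits.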


EXAMPLE 15. Let’s look at the exponential function

$$f(x) = e^x, \qquad x \in R$$

on the real line. We know that f is continuous, and Example 9 in Section
2.3 showed us that

$$f(x + y) = f(x) f(y).$$

From the series definition, it is obvious that f(x) > 1 if x > 0. Since
$e^x e^{-x} = 1$, clearly f(x) < 1 if x < 0. Therefore f is strictly increasing:

$$e^y = e^{y - x} e^x > e^x, \qquad x < y.$$

So the image of f is an interval contained in the interval $(0, \infty)$. Plainly
f is not bounded, because

$$f(n) = f(1)^n = e^n.$$

Hence, the image of f is the interval $(0, \infty)$. Conclusion: If y > 0, there
exists one and only one real number x such that

$$y = e^x.$$

The inverse function $f^{-1}$ is called the (natural) logarithm function

$$x = \log y,$$

i.e.,

$$e^{\log y} = y,$$

and that function is continuous.


EXAMPLE 16. Suppose we did not know that every complex number of
absolute value 1 had the form $e^{it}$, for some real $t$. Let's see how we could
deduce that from things we have developed in this book. Let $F$ be the
map from $R$ into $C$ defined by

$$F(t) = e^{it}, \qquad t \in R.$$

Since $e^{it}e^{-it} = e^0 = 1$ and $e^{-it} = (e^{it})^*$, we have

$$|e^{it}| = 1.$$

Thus $F$ maps $R$ into the unit circle. Why onto the circle? Certainly $F$ is
not constant, because $F(0) = 1$ and

$$F(2) = 1 + \sum_{n=1}^{\infty} \frac{(2i)^n}{n!}$$

so that

$$\operatorname{Re} F(2) = \sum_{k=0}^{\infty} (-1)^k \frac{2^{2k}}{(2k)!} = 1 - 2 + \frac{2}{3} - \cdots < 0.$$

What do we know about the image of $F$? It is a connected subset of
the unit circle. It contains 1, and it contains the number $w = F(2)$ which
satisfies $\operatorname{Re} w < 0$. Therefore, either $-w$ or $-w^*$ is in the image of $F$.
(Otherwise, the line through $-w$ and $-w^*$ would disconnect the image.
See Figure 14.) But, the image of $F$ is symmetric about the real axis because
$F(-t) = F(t)^*$. Thus $-w$ is in the image of $F$:

$$-w = F(a), \qquad a \in R.$$

Since $F(2)F(a - 2) = F(a) = -F(2)$ we have $F(a - 2) = -1$. The fact
that both 1 and $-1$ are in the (connected) image makes it apparent that
the image of $F$ is the full unit circle.


The function $F(t) = e^{it}$ is certainly not 1:1. In fact (as you know) it
is periodic. If $F(b) = -1$, then $F(2b) = 1$ so that the set

$$S = \{t \in R;\ F(t) = 1\}$$

contains a non-zero number. Let

$$c = \inf \{t \in S;\ t > 0\}.$$

It is not too difficult to show that $c > 0$ and that $S$ consists of all
integer multiples of $c$. The number $c$ has come to be called $2\pi$.

FIGURE 14


EXAMPLE 17. We cannot delete from Theorem 10 the hypothesis that
$D$ is compact. Look at the map

$$F(t) = e^{it}, \qquad 0 \le t < 2\pi. \tag{3.14}$$

Then $F$ is continuous and 1:1, mapping the interval $[0, 2\pi)$ onto the
unit circle in the plane. But $F^{-1}$ is not continuous because it "jumps" at
$z = 1$.


EXAMPLE 18. We have the exponential map

$$F(A) = e^A$$

from $k \times k$ matrices into the invertible $k \times k$ matrices. Is it onto? I.e.,
does every invertible matrix have a logarithm? Not in the real case. The
image of $F$ is connected, because the space of all $k \times k$ matrices is
connected. (It is $R^{k^2}$.) But the set of invertible $k \times k$ real matrices is
disconnected by the determinant function. It is the union of the matrices with
positive determinant and those with negative determinant. Clearly then
$\det e^A > 0$, for all real $k \times k$ matrices $A$. In the complex case, $F$ does
map onto the set of all invertible matrices, but we'll worry about that
later.
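The conclusion $\det e^A > 0$ can be spot-checked numerically. The sketch below is an illustration only and is not the connectedness argument of the example: it approximates $e^A$ for random real $2 \times 2$ matrices by a partial sum of the exponential series and compares the determinant with $e^{\operatorname{tr} A}$, a standard identity that this section does not derive. The function names and the sampling range are assumptions made for the illustration.

```python
import math
import random

def exp_2x2(a, terms=60):
    """Approximate e^A for a 2 x 2 matrix by a partial sum of the series."""
    (p, q), (r, s) = a
    total = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms + 1):
        # term <- term * A / k, so that term equals A^k / k!
        term = [[(term[i][0] * p + term[i][1] * r) / k,
                 (term[i][0] * q + term[i][1] * s) / k] for i in range(2)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

def det_2x2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

random.seed(0)
samples = [[[random.uniform(-2, 2) for _ in range(2)] for _ in range(2)]
           for _ in range(200)]
dets = [det_2x2(exp_2x2(a)) for a in samples]
traces = [a[0][0] + a[1][1] for a in samples]
print(min(dets) > 0)
```

Every sampled determinant is positive, so no matrix with negative determinant shows up as $e^A$ for real $A$ in this experiment.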


Exercises 


1. True or false? If $S$ is a subset of $R$ and
$$S^2 = \{x^2;\ x \in S\}$$
is compact, then $S$ is compact.


2. If $w$ is a complex number and $w \neq 0$, then $w = e^z$ for some $z \in C$.


3. The matrix $A$ is called skew-symmetric if $A^t = -A$. Show that, if $A$ is
skew-symmetric, then $e^A$ is an orthogonal matrix. Is every orthogonal matrix of that
form?


4. The set of points (x, y) € R? such that x? + y3 is irrational is disconnected. 


5. True or false? The set of points $(x, y) \in R^2$ such that either $x$ or $y$ is rational
is a disconnected set.


6. Prove Theorem 10 using sequences. 


7. Prove that every continuous function

$$[0, 1] \xrightarrow{f} [0, 1]$$

has a fixed point (a point $x$ such that $f(x) = x$).


8. Prove that every continuous map of $[0, 1)$ onto $[0, 1)$ has a fixed point.


9. Prove that there does not exist any 1:1 continuous map of $(0, 1)$ onto $[0, 1]$.


10. Let $I$ be the closed interval $[0, 1]$ and let $S$ be the closed square $I \times I$ in $R^2$.
It is a (somewhat painful) fact that there exists a continuous map of $I$ onto $S$.
You are not expected to prove that, but prove that there does not exist a 1:1
continuous map of $I$ onto $S$. Hint: What happens to $I$ when you throw away $\frac{1}{2}$?


11. True or false? If

$$R^k \xrightarrow{F} R^m$$

and if $F^{-1}(K)$ is compact for all compact $K$, then $F$ is continuous.


12. A function is called a proper map if it is continuous and the inverse image of 
every compact set is compact. 


(a) True or false? If $F$ is a proper map and $K$ is closed relative to the domain
of $F$, then $F(K)$ is closed.
(b) True or false? If F is a proper map and U is open, then F(U) is open. 


13. If $K_1$ and $K_2$ are compact sets in $R^n$, then the algebraic sum
$$K_1 + K_2 = \{X_1 + X_2;\ X_j \in K_j\}$$
is compact.


14. True or false? If $T$ is a linear transformation from $R^k$ into $R^m$ and if $U$ is an
open set in $R^k$, then $T(U)$ is open relative to the image of $T$.


15. Let $D$ be a compact set in $R^k$; let $F$ be a function from $R^k$ into $R^m$, and
suppose $F$ is bounded. Then $F$ is continuous if and only if the graph of $F$ is a closed
subset of $R^{k+m}$.


16. Let $f$ be a non-zero continuous function from the real line into the complex
numbers which satisfies

$$f(x + t) = f(x)f(t)$$

for all $x$, $t$. Prove that there exists a complex number $z$ such that $f(t) = e^{zt}$ for
all $t$.


3.6. Uniform Continuity


The motivation for our definition of continuity was the idea of a
function

$$D \xrightarrow{F} R^m, \qquad D \subset R^k \tag{3.15}$$

such that $F(X)$ is near $F(X_0)$ provided $X$ is sufficiently near $X_0$. But, as we
defined continuity, the "sufficiently near" could vary from one locality to
the other in the domain $D$. Given $\varepsilon > 0$ and given $X_0 \in D$, we can find
$\delta > 0$ such that

$$|F(X) - F(X_0)| < \varepsilon, \qquad |X - X_0| < \delta.$$

The "sufficiently near" is measured by $\delta$, which presumably depends on
$X_0$ as well as on $\varepsilon$. For some functions, the $\delta$ can always be chosen so as
not to depend on $X_0$.


Definition. The function $F$ (3.15) is uniformly continuous if, for each
$\varepsilon > 0$, there exists a number $\delta > 0$ such that

$$|F(X_1) - F(X_2)| < \varepsilon, \qquad X_1, X_2 \in D, \quad |X_1 - X_2| < \delta.$$

EXAMPLE 19. Let us look at (what is perhaps) the simplest example of
a continuous function which is not uniformly continuous. Let

$$f(x) = x^2, \qquad x \in R.$$

Now

$$f(x) - f(t) = x^2 - t^2 = (x + t)(x - t).$$

Suppose $\varepsilon > 0$ and we wish to find $\delta$ so that

$$|f(x) - f(t)| < \varepsilon, \qquad |x - t| < \delta. \tag{3.16}$$

If we fix $t$, we can do that because $x + t$ does not become too large near
any fixed $t$. But, if we try to do it for all points $x$, $t$ simultaneously, that
factor $x + t$ is not bounded and we get into trouble. For instance, if
(3.16) holds, then it holds for every $t > 0$ with $x = t + \frac{1}{2}\delta$; hence

$$\left(2t + \tfrac{1}{2}\delta\right)\frac{\delta}{2} < \varepsilon, \qquad t > 0.$$

That implies $\delta = 0$, which won't do.
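The failure in Example 19 can be made concrete. For any proposed $\delta$, the point $t = \varepsilon/\delta$ together with $x = t + \frac{1}{2}\delta$ gives $|x - t| < \delta$ but $f(x) - f(t) = \varepsilon + \frac{1}{4}\delta^2 > \varepsilon$. The short sketch below, with the hypothetical helper `witness`, simply evaluates this choice for a few values of $\delta$; it is an illustration, not part of the original argument.

```python
def f(x):
    return x * x

def witness(eps, delta):
    """A point t > 0 for which x = t + delta/2 violates (3.16)."""
    return eps / delta

eps = 1.0
results = []
for delta in (1.0, 0.1, 0.001):
    t = witness(eps, delta)
    x = t + delta / 2
    results.append((delta, abs(x - t), abs(f(x) - f(t))))
    print(delta, t, abs(f(x) - f(t)))
```

The smaller $\delta$ is, the farther out on the line the offending pair sits, which is exactly the dependence of $\delta$ on the location that uniform continuity forbids.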


At times it is convenient to reword the condition of uniform continuity
in the following way. Suppose we have a function $F$ with non-empty
domain $D$. For each positive number $\eta$ define

$$\omega(F, \eta) = \sup \{|F(X) - F(T)|;\ X, T \in D,\ |X - T| \le \eta\}.$$

This function $\omega$ is called the modulus of continuity of $F$ (whether or not $F$
is continuous). Evidently we have

$$0 \le \omega(F, \eta) \le \infty$$

for each $\eta$. Roughly speaking, the value $\omega(F, \eta)$ measures how far apart
$F$ can send any two points in its domain which are themselves not more
than $\eta$ apart. The function $F$ is uniformly continuous if and only if

$$\lim_{\eta \to 0} \omega(F, \eta) = 0. \tag{3.17}$$

For instance, if $F$ is uniformly continuous and $\varepsilon > 0$, there exists $\delta > 0$
such that

$$|F(X) - F(T)| < \varepsilon, \qquad X, T \in D, \quad |X - T| < \delta.$$

Thus we have $\omega(F, \eta) \le \varepsilon$ for all $\eta < \delta$. It is equally easy to see that, if
(3.17) holds, then $F$ is uniformly continuous.
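To get a feeling for the modulus of continuity, one can estimate it on a grid. The sketch below is a rough numerical illustration under stated assumptions: the function is real-valued on an interval $[a, b]$, the supremum is replaced by a maximum over equally spaced grid points, and the name `modulus` and the grid size are choices made here, not the book's. For $\sqrt{x}$ on $[0, 1]$ the estimate does not depend on where the interval sits, while for $x^2$ on $[0, L]$ it grows with $L$.

```python
import math

def modulus(f, a, b, eta, steps=400):
    """Grid estimate of w(f, eta) on [a, b]: the largest |f(x) - f(t)|
    over grid points x, t with |x - t| <= eta."""
    h = (b - a) / steps
    pts = [a + i * h for i in range(steps + 1)]
    reach = int(eta / h + 1e-9)  # number of grid steps that fit inside eta
    best = 0.0
    for i in range(steps + 1):
        for j in range(i + 1, min(i + reach, steps) + 1):
            best = max(best, abs(f(pts[j]) - f(pts[i])))
    return best

def square(x):
    return x * x

print(modulus(math.sqrt, 0.0, 1.0, 0.25))
for length in (1.0, 10.0, 100.0):
    print(length, modulus(square, 0.0, length, 0.5))
```

On $[0, L]$ the exact value for $x^2$ is $2L\eta - \eta^2$, so the estimates with $\eta = 0.5$ increase without bound as $L$ grows, in line with Example 19.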

The next theorem provides us with many examples of continuous
functions which are not uniformly continuous. The subsequent theorem is
more positive.


Theorem 11. If F is uniformly continuous and if the domain D has
compact closure (i.e., if D is bounded), then F is a bounded function.


Proof. Apply the definition of uniform continuity with $\varepsilon = 5$. There
exists $\delta$ such that

$$|F(X_1) - F(X_2)| < 5, \qquad X_j \in D, \quad |X_1 - X_2| < \delta.$$

The family of open balls $B(X; \delta)$, as $X$ varies over $D$, is an open cover of
$\bar{D}$. Thus there exist points $X_1, \ldots, X_n$ in $D$ such that the balls $B(X_j; \delta)$
cover $\bar{D}$. In particular, if $X \in D$ then $X$ is within distance $\delta$ of one of the
points $X_1, \ldots, X_n$. Because of the way $\delta$ was chosen we then have

$$|F(X) - F(X_j)| < 5$$

for some $j$. Clearly $|F| \le M$, where

$$M = \max_j \left(|F(X_j)| + 5\right).$$

Here is (the sketch of) another proof. Suppose $F$ is not bounded.
Choose points $X_n$ in $D$ such that

$$\lim_n |F(X_n)| = \infty. \tag{3.18}$$

Some subsequence of $\{X_n\}$ converges in $R^k$. The limit point need not be
in $D$, but that is irrelevant. After a while, all the points in the subsequence
are less than $\delta$ (3.17) apart. Thus, after a while, all the corresponding
values of $F$ are less than 5 apart. That contradicts (3.18).


Theorem 12. Let F be a continuous function. If the domain of F is
compact, then F is uniformly continuous.

Proof. Let $\varepsilon > 0$. Let $X \in D$. Since $F$ is continuous at $X$, there exists
a number $\delta_X > 0$ such that

$$|F(Y) - F(X)| < \tfrac{1}{2}\varepsilon, \qquad Y \in D, \quad |Y - X| < \delta_X.$$

Let $U_X = B(X; \frac{1}{2}\delta_X)$. Then $\{U_X;\ X \in D\}$ is an open cover of $D$. Since $D$
is compact, there exist $X_1, \ldots, X_n$ in $D$ such that the sets $U_{X_j}$ cover $D$.
The fact that these sets cover $D$ means that if $X \in D$ there exists $j$,
$1 \le j \le n$, such that

$$|X - X_j| < \tfrac{1}{2}\delta_{X_j}. \tag{3.19}$$

Let

$$\delta = \min_j \tfrac{1}{2}\delta_{X_j}.$$

Now, let $X$ and $Y$ be any two points in $D$ such that $|X - Y| < \delta$. Choose
an index $j$ such that (3.19) holds. Then

$$\begin{aligned}
|Y - X_j| &\le |Y - X| + |X - X_j| \\
&< \delta + \tfrac{1}{2}\delta_{X_j} \\
&\le \delta_{X_j}.
\end{aligned}$$

By (3.19)

$$|F(Y) - F(X_j)| < \tfrac{1}{2}\varepsilon, \qquad |F(X) - F(X_j)| < \tfrac{1}{2}\varepsilon.$$

Thus

$$|F(X) - F(Y)| < \varepsilon.$$

This holds for any two points in $D$ which are less than $\delta$ apart. We have
proved that $F$ is uniformly continuous.


The proof of Theorem 12 is not trivial; but, it is a very important
theorem and the proof must be understood thoroughly. Here is another
proof: Let $\varepsilon > 0$. Suppose we cannot find a suitable $\delta$. Then, for each $n$,
we can find points $X_n$, $Y_n$ in $D$ with

$$|X_n - Y_n| < \frac{1}{n} \quad \text{and} \quad |F(X_n) - F(Y_n)| \ge \varepsilon.$$

Since $D$ is compact, there exists in $D$ a cluster point $X$ of the sequence
$\{X_n\}$. Now $F$ is continuous at the point $X$. So there exists $a > 0$ such that

$$|F(Y) - F(X)| < \tfrac{1}{2}\varepsilon, \qquad |Y - X| < a.$$

Pick some $n > 2/a$ such that

$$|X_n - X| < \frac{a}{2}.$$

Then $|Y_n - X| \le |Y_n - X_n| + |X_n - X| < a$ as well. We then have

$$\begin{aligned}
|F(X_n) - F(X)| &< \tfrac{1}{2}\varepsilon \\
|F(Y_n) - F(X)| &< \tfrac{1}{2}\varepsilon \\
|F(Y_n) - F(X_n)| &< \varepsilon,
\end{aligned}$$

which contradicts the choice of $X_n$, $Y_n$.

Before we look at some additional specific examples, let us tie together
Theorems 11 and 12. The following result makes Theorem 11 rather
obvious.


Theorem 13. Let F be a uniformly continuous function on D. Then F
can be extended to a continuous function on the closure of D, i.e., there
exists a function $\bar{F}$

$$\bar{D} \xrightarrow{\bar{F}} R^m$$

such that
(a) $\bar{F}$ is continuous;
(b) $\bar{F}(X) = F(X)$ for each X in D.


The extension $\bar{F}$ is unique, and $\bar{F}$ is uniformly continuous.


Proof. Obviously there could not be more than one such extension:
Suppose $\bar{F}$ satisfies (a) and (b). Let $X$ be any point in $\bar{D}$. There exists in $D$
a sequence $\{X_n\}$ which converges to $X$. Since $\bar{F}$ is continuous,

$$\bar{F}(X) = \lim_n \bar{F}(X_n) = \lim_n F(X_n).$$

That shows that $\bar{F}(X)$ is uniquely determined by $F$.

But, why does any such extension exist? We shall define $\bar{F}$. Motivated
by the previous paragraph, we proceed as follows. Let $X \in \bar{D}$. Choose a
sequence $\{X_n\}$ in $D$ which converges to $X$. We'll prove that the sequence
$\{F(X_n)\}$ converges; and then, we'll define

$$\bar{F}(X) = \lim_n F(X_n). \tag{3.20}$$

Since $F$ is continuous, we shall have $\bar{F}(X) = F(X)$ for $X \in D$. All that
we must do then is to prove that $\bar{F}$ is uniformly continuous.

Everything is taken care of by this observation: If $\{X_n\}$ is a Cauchy
sequence in $D$, then $\{F(X_n)\}$ is a Cauchy sequence. That is where we use
the uniform continuity of $F$. We have left the proof to the exercises.
Suppose we have verified that fact about Cauchy sequences. The existence
and uniform continuity of $\bar{F}$ then go like this.


(i) Pick any $X \in \bar{D}$. Let $\{X_n\}$ be a sequence in $D$ which converges to
$X$. Then $\{X_n\}$ is a Cauchy sequence; so $\{F(X_n)\}$ is a Cauchy sequence in
$R^m$ and therefore converges to a point in $R^m$. Define $\bar{F}(X)$ to be the limit
of $F(X_n)$. In case $X$ is in $D$, we will have $\bar{F}(X) = F(X)$ because $F$ is
continuous at $X$.

(ii) Let $\varepsilon > 0$. Choose $\delta > 0$ such that

$$|F(X) - F(Y)| < \varepsilon, \qquad X, Y \in D, \quad |X - Y| < \delta. \tag{3.21}$$

Let $X$, $Y$ be in $\bar{D}$. We have the corresponding sequences $\{X_n\}$ and $\{Y_n\}$ in
$D$ which were used to define $\bar{F}(X)$ and $\bar{F}(Y)$. Suppose $|X - Y| < \delta$ (3.21). Then

$$|X_n - Y_n| < \delta, \qquad n > N.$$

Consequently (3.21),

$$|F(X_n) - F(Y_n)| < \varepsilon, \qquad n > N.$$

By (3.20)

$$|\bar{F}(X) - \bar{F}(Y)| \le \varepsilon.$$

Thus $\bar{F}$ is uniformly continuous.


We should remark that the uniqueness of the extension tells us that
the vector $\bar{F}(X)$ defined in (i) did not depend upon the particular sequence
$\{X_n\}$ which we used. The fact that $\bar{F}(X)$ is independent of $\{X_n\}$ is also clear
from the special case of (ii) in which $X = Y$.


EXAMPLE 20. Let's look at three functions on the interval $(0, 1)$:

$$\begin{aligned}
f(x) &= \frac{1}{x}, & 0 < x < 1 \\
g(x) &= \sin \frac{1}{x}, & 0 < x < 1 \\
h(x) &= x^2 \sin \frac{1}{x}, & 0 < x < 1.
\end{aligned}$$

Each of these functions is continuous. The function $f$ is not uniformly
continuous. This can be seen several ways. For instance, $(0, 1)$ is bounded
but $f$ is not bounded. The function $g$ is bounded, but $g$ is not uniformly
continuous. For one thing we obviously cannot extend $g$ to a continuous
function on the closed interval $[0, 1]$, because $g$ does not have a limit at 0.
Now $h$ is uniformly continuous, because

$$\lim_{x \to 0} h(x) = 0 \quad \text{and} \quad \lim_{x \to 1} h(x) = \sin 1$$

so, $h$ has a continuous extension to $[0, 1]$.


EXAMPLE 21. Let $f_1$, $g_1$, $h_1$ be the functions on $(0, \infty)$ defined by the
formulas for $f$, $g$, $h$ in Example 20. Then $f_1$ and $g_1$ cannot be uniformly
continuous, because $f$ and $g$ are not. What about $h_1$? We can extend $h_1$
continuously to the closure of the domain, i.e., to $[0, \infty)$; however, that
doesn't necessarily guarantee that $h_1$ is uniformly continuous, because
$[0, \infty)$ is not compact. Now

$$\lim_{x \to \infty} h_1(x) = \infty$$

because

$$\lim_{x \to \infty} x \sin \frac{1}{x} = \lim_{t \to 0} \frac{\sin t}{t} = 1.$$

Thus $h_1(x)$ behaves like $x$ for large $x$. No function (other than a constant)
could be more uniformly continuous than "$x$":

$$|x - t| < \varepsilon \quad \text{if} \quad |x - t| < \varepsilon.$$

Thus, it seems likely that $h_1$ is an (unbounded) uniformly continuous
function. Such is the case. We omit the details.
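One possible way to fill in the omitted details, offered here as a sketch rather than as the author's intended argument, uses the derivative: $h_1'(x) = 2x \sin(1/x) - \cos(1/x)$, and since $|\sin u| \le |u|$ we get $|h_1'(x)| \le 2 + 1 = 3$ for all $x > 0$. The mean value theorem (reviewed in Chapter 4) then gives $|h_1(x) - h_1(t)| \le 3|x - t|$, so $\delta = \varepsilon/3$ would work. The code below only spot-checks that Lipschitz bound on randomly chosen pairs; the sampling ranges are arbitrary choices.

```python
import math
import random

def h1(x):
    """h_1(x) = x^2 sin(1/x) on (0, infinity)."""
    return x * x * math.sin(1.0 / x)

random.seed(1)
worst = 0.0
for _ in range(20000):
    x = random.uniform(1e-3, 50.0)
    t = x + random.uniform(1e-6, 1.0)
    # ratio of the change in h_1 to the distance between the two points
    worst = max(worst, abs(h1(x) - h1(t)) / abs(x - t))
print(worst)
```

The largest observed ratio stays below 3, consistent with the derivative estimate.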


EXAMPLE 22. Suppose

$$R^k \xrightarrow{T} R^m$$

is a linear transformation: $T(cX_1 + X_2) = cT(X_1) + T(X_2)$. In Example
7 we saw that every linear transformation is uniformly continuous. This
is easy to see. Let $\varepsilon > 0$. Since $T$ is continuous at the origin, there exists
$\delta > 0$ such that

$$|T(X)| < \varepsilon, \qquad |X| < \delta. \tag{3.22}$$

(Remember that $T(0) = 0$.) That same $\delta$ will work at every other point
$X_0$. Why? Well, since $T$ is linear,

$$T(X) - T(X_0) = T(X - X_0) \tag{3.23}$$

so that (3.22) tells us

$$|T(X) - T(X_0)| < \varepsilon, \qquad |X - X_0| < \delta.$$

Essentially the same argument establishes the uniform continuity of
any affine transformation:

$$A(X) = Y_0 + T(X)$$

where $T$ is linear and $Y_0$ is a fixed vector in $R^m$.

For linear transformations, there is a concise way to describe the
uniform continuity. Suppose we determine the $\delta$ for some $\varepsilon$, as in (3.22).
Let $X$ be any non-zero vector in $R^k$. Then

$$\left|\frac{\delta X}{|X|}\right| = \delta$$

so

$$\left|T\!\left(\frac{\delta X}{|X|}\right)\right| \le \varepsilon, \qquad \frac{\delta}{|X|}\,|T(X)| \le \varepsilon, \qquad |T(X)| \le \frac{\varepsilon}{\delta}\,|X|.$$

That is, $T$ does not change length by a factor of more than $\varepsilon/\delta$. The norm
of $T$ is defined to be

$$\|T\| = \sup_{X \neq 0} \frac{|T(X)|}{|X|} = \sup \{|T(X)|;\ |X| = 1\}. \tag{3.24}$$

Then we have 


$$|T(X)| \le \|T\|\,|X|, \qquad X \in R^k. \tag{3.25}$$

And $\|T\|$ is the smallest positive constant which will suffice on the
right-hand side of (3.25). The delta-epsilonics is now trivially described:

$$|T(X)| < \varepsilon, \qquad |X| < \frac{\varepsilon}{\|T\|}.$$
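The norm of a concrete linear transformation can be estimated directly from the second form of the definition, $\sup\{|T(X)|;\ |X| = 1\}$. The sketch below does this for a map on $R^2$ by scanning the unit circle, then checks inequality (3.25) on random vectors. It is an illustration under assumptions: the matrix `A` (a shear), the function names, and the sample counts are choices made here. For this particular shear the exact value $1 + \sqrt{2}$ is the square root of the largest eigenvalue of $A^tA$, a fact quoted for comparison and not derived in this section.

```python
import math
import random

def apply(a, x):
    """T(X) for the linear map whose matrix is a."""
    return [sum(a[i][j] * x[j] for j in range(len(x))) for i in range(len(a))]

def length(x):
    return math.sqrt(sum(c * c for c in x))

def norm_estimate(a, samples=20000):
    """Estimate ||T|| = sup{|T(X)|; |X| = 1} on R^2 by scanning the unit circle."""
    best = 0.0
    for i in range(samples):
        theta = 2 * math.pi * i / samples
        best = max(best, length(apply(a, [math.cos(theta), math.sin(theta)])))
    return best

A = [[1.0, 2.0], [0.0, 1.0]]   # a shear of the plane
M = norm_estimate(A)
print(M, 1 + math.sqrt(2))

random.seed(2)
vectors = [[random.uniform(-5, 5), random.uniform(-5, 5)] for _ in range(1000)]
print(all(length(apply(A, x)) <= (1 + math.sqrt(2)) * length(x) + 1e-9 for x in vectors))
```

With $M$ in hand, the uniform continuity of $T$ is explicit: any $\delta \le \varepsilon / M$ works at every point simultaneously.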

EXAMPLE 23. Suppose $t > 0$. How do we define $t^x$, where $x \in R$?
That's easy:

$$t^x = \exp(x \log t).$$

For purposes of illustration, suppose we did it this way. After we have
nth roots of positive numbers, it is easy to define $t^x$ when $x$ is rational:

$$t^{m/n} = (t^{1/n})^m.$$

Then, for rational $a$, $b$,

$$t^{a+b} = t^a t^b, \qquad t^{ab} = (t^a)^b.$$

It is also easy to verify that

$$\lim_{x \to 0} t^x = 1.$$

Since $x$ is only rational, we should say

$$\lim_{x \to 0} f(x) = 1 \tag{3.26}$$

where

$$f(x) = t^x, \qquad x \in Q.$$

Is $f$ uniformly continuous on $Q$? Well

$$f(b) - f(a) = t^b - t^a = t^a(t^{b-a} - 1). \tag{3.27}$$

That factor $t^a$ suggests that $f$ is not uniformly continuous on all of $Q$;
however, on any bounded interval the uniform continuity is apparent
from (3.26) and (3.27). So we can extend $f$ continuously to that full interval.
That defines $t^x$, for all real $x$.


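EXAMPLE 24. Let $K$ be the Cantor set (Example 15 of Chapter 2).
We obtain $K$ by deleting the open interval $(\frac{1}{3}, \frac{2}{3})$ from $[0, 1]$, and then
deleting the middle thirds of the two remaining intervals, etc. Let $D$ be
the union of those open intervals. We define a function $f$ on $D$ by

$$f = \begin{cases}
\frac{1}{2} & \text{on } (\frac{1}{3}, \frac{2}{3}) \\
\frac{1}{4} & \text{on } (\frac{1}{9}, \frac{2}{9}) \\
\frac{3}{4} & \text{on } (\frac{7}{9}, \frac{8}{9}) \\
\frac{1}{8} & \text{on } (\frac{1}{27}, \frac{2}{27}) \\
\frac{3}{8} & \text{on } (\frac{7}{27}, \frac{8}{27}) \\
\ \vdots
\end{cases}$$

FIGURE 15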


(See Figure 15.) It should be clear that

$$|f(x) - f(y)| \le 2^{-n}, \qquad \text{if } |x - y| < 3^{-n}.$$

Thus, $f$ is a uniformly continuous function on $D$, and so has a continuous
extension to $\bar{D} = [0, 1]$. The extension $\bar{f}$ is called the Cantor function.
Notice that its graph is flat practically everyplace, but $\bar{f}(x)$ still manages
to climb from 0 up to 1, continuously.
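The Cantor function can be evaluated from the ternary digits of $x$: read digits until the first 1 appears (the point then lies in a deleted middle third, where the function is constant), and turn each earlier digit 2 into a binary digit 1. The sketch below uses that standard description, which is an assumption relative to the text (the example defines $f$ interval by interval), and checks the estimate $|f(x) - f(y)| \le 2^{-n}$ for $|x - y| < 3^{-n}$ on a grid. The function name `cantor` and the digit depth are hypothetical choices, and floating-point arithmetic makes the values approximate.

```python
def cantor(x, depth=40):
    """Approximate the Cantor function on [0, 1] from the ternary digits of x."""
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)
        x -= digit
        if digit == 1:
            # x lies in a deleted middle third, where the function is constant
            return value + scale
        value += scale * (digit // 2)   # ternary digit 2 becomes binary digit 1
        scale *= 0.5
    return value

print(cantor(0.5), cantor(0.15), cantor(0.8))

worst_excess = 0.0
for n in range(1, 6):
    gap = 0.999 * 3.0 ** -n
    for i in range(1000):
        x = i / 1000.0
        y = min(1.0, x + gap)
        worst_excess = max(worst_excess, abs(cantor(y) - cantor(x)) - 2.0 ** -n)
print(worst_excess <= 1e-9)
```

The three sample points lie in $(\frac{1}{3}, \frac{2}{3})$, $(\frac{1}{9}, \frac{2}{9})$ and $(\frac{7}{9}, \frac{8}{9})$, so the first line of output reproduces the first three values in the definition of $f$.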


EXAMPLE 25. One nice class of uniformly continuous functions
consists of the contractions:

$$|F(X) - F(Y)| \le |X - Y|.$$

For such functions "we can always take $\delta = \varepsilon$". A rigid motion (or
congruence) of $R^k$ is a function

$$R^k \xrightarrow{T} R^k$$

such that

$$|T(X_1) - T(X_2)| = |X_1 - X_2|. \tag{3.28}$$

Euclidean geometry is essentially the study of those properties of sets
which are invariant under (unaffected by) rigid motions. One example of
a rigid motion is translation

$$T(X) = X_0 + X.$$

Another example is an orthogonal (linear) transformation:


(a) $T$ is linear;

(b) $\langle T(X_1), T(X_2) \rangle = \langle X_1, X_2 \rangle$.
Such a linear transformation on $R^k$ is one which can be represented by an
orthogonal $k \times k$ matrix $A$, i.e., one for which the rule is $T(X) = AX$
where $AA^t = I$.

Theorem 14. Let $Y_0$ be a fixed vector in $R^k$ and let T be an orthogonal
linear transformation on $R^k$. Then the (affine) transformation

$$S(X) = Y_0 + T(X) \tag{3.29}$$

is a rigid motion of $R^k$. Furthermore every rigid motion of $R^k$ has this form.

Proof. We have already remarked that a transformation of the form
(3.29) is a rigid motion, provided $T$ is an orthogonal linear transformation.
It is the converse which must be verified, that every rigid motion is an
orthogonal transformation followed by a translation. So, let $S$ be any
rigid motion of $R^k$. Let $Y_0 = S(0)$ and define $T$ by

$$T(X) = S(X) - Y_0.$$

Since $S$ preserves distance, $T$ does also. Therefore $T$ is a rigid motion and
$T(0) = 0$. If we apply the "rigidity" property

$$|T(X_1) - T(X_2)| = |X_1 - X_2| \tag{3.30}$$

to the case $X_2 = 0$ and use the fact that $T(0) = 0$, we obtain

$$|T(X_1)| = |X_1|. \tag{3.31}$$

Now we can verify easily that $T$ preserves inner products. We have

$$\begin{aligned}
|X_1 - X_2|^2 &= |X_1|^2 + |X_2|^2 - 2\langle X_1, X_2 \rangle \\
|T(X_1) - T(X_2)|^2 &= |T(X_1)|^2 + |T(X_2)|^2 - 2\langle T(X_1), T(X_2) \rangle
\end{aligned}$$

which, if we use (3.30) and (3.31), yields immediately

$$\langle T(X_1), T(X_2) \rangle = \langle X_1, X_2 \rangle.$$

A transformation which preserves inner products is necessarily linear. We
have left the proof as an exercise.


Corollary. Every rigid motion of $R^k$ maps $R^k$ onto $R^k$, and its inverse
map is a rigid motion of $R^k$.

Proof. A linear transformation from $R^k$ into $R^k$ which is 1:1
necessarily maps $R^k$ onto $R^k$.


Exercises 


1. Let $F$ be a uniformly continuous function. If $\{X_n\}$ is a Cauchy sequence in
the domain of $F$, then $\{F(X_n)\}$ is a Cauchy sequence.


2. If $F$ is a function from $R^k$ to $R^m$, we say that $F$ vanishes at infinity if, for
every $\varepsilon > 0$, the set $\{X;\ |F(X)| \ge \varepsilon\}$ is compact. If $F$ is continuous and vanishes
at infinity, then $F$ is uniformly continuous.


3. True or false? The map $F = (f_1, \ldots, f_m)$ is uniformly continuous if and
only if each $f_j$ is uniformly continuous.


4. True or false? The continuous function $F$ is uniformly continuous if and
only if $F$ has a limit at each cluster point of its domain.


5. True or false? If $f$ is a uniformly continuous function on the real line then
the difference quotient

$$\frac{f(x) - f(t)}{x - t}, \qquad x \neq t$$

is bounded.


6. Let $F$ be a uniformly continuous function

$$D \xrightarrow{F} R^m$$

and let $\bar{F}$ be the extension of $F$ to $\bar{D}$. True or false? The graph of $\bar{F}$ is the closure
of the graph of $F$.


7. True or false? If $f$ and $g$ are bounded uniformly continuous functions on $D$,
then $fg$ is uniformly continuous.


8. True or false? If $f$ is a uniformly continuous function on $R$, then $g(x, y) =
f(x) - f(y)$ defines a uniformly continuous function on $R^2$.


9. True or false? If

$$D \xrightarrow{F} R^m$$

is uniformly continuous and $S \subset D$ is bounded, then $F(S)$ is bounded.


10. Let $T$ be a linear transformation from $R^k$ to $R^m$. Let $E_1, \ldots, E_k$ be the
standard basis for $R^k$. Let

$$T(E_j) = (a_{1j}, a_{2j}, \ldots, a_{mj}), \qquad 1 \le j \le k.$$

The $m \times k$ matrix $A = [a_{ij}]$ is the standard representing matrix for $T$. It
completely determines $T$, as follows. If $Y = T(X)$, then

$$Y^t = AX^t$$

where $X^t$ is the transpose of $X$:

$$X^t = \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix}, \quad \text{etc.}$$

Verify this and then verify that $T$ is orthogonal from $R^k$ to $R^k$ if and only if $A$
is an orthogonal matrix. (It might be more natural to let $A^t$ be the representing
matrix, so that $Y = T(X)$ means $Y = XA$; but, analysts write functions on the
left, and so we do it the other way.)


11. Recall the definition of the norm of a linear transformation, in Example 22.
Let's look at linear transformations from $R^k$ into $R^k$.
(a) $\|T_1 \circ T_2\| \le \|T_1\|\,\|T_2\|$.
(b) $\|T\| \le |A|$, where $A$ is the standard representing matrix (Exercise 10).
(c) Give an example where $\|T\| < |A|$.
(d) If $\|T\| < 1$, then $I - T$ is invertible.


(Think about the relation to Example 9 of Chapter 2.)


12. Let $T$ be a linear transformation from $R^k$ into $R^k$.


(a) $T$ is an orthogonal transformation if and only if the standard representing
matrix $A$ is an orthogonal matrix.
(b) $T$ is an orthogonal transformation if and only if $T$ is invertible and

$$\|T\| = \|T^{-1}\| = 1.$$


13. Show that any transformation from $R^k$ to $R^k$ which preserves inner products
is linear.


14. Let $T$ be a function from $R^k$ into $R^m$ which is additive

$$T(X_1 + X_2) = T(X_1) + T(X_2).$$

Suppose that there exists some neighborhood of the origin in $R^k$ such that the
restriction of $T$ to that neighborhood is a bounded function. Prove that $T$ is a
linear transformation.


4. Calculus Revisited


4.1. Differentiation on Intervals 


This chapter is a review of what one might call the intellectual skeleton 
of calculus. It is not intended to be a comprehensive review of the subject 
because it does not deal extensively with the techniques or applications of 
calculus. The focus is on the basic concepts and theorems, some of which 
we will need to extend beyond the level of generality customarily found in 
a calculus course. 


Definition. Let I be an interval on the real line, and let F be a function
from I into $R^m$. If x is a point of I, we say that F is differentiable at x if the
limit

$$\lim_{t \to x} \frac{F(t) - F(x)}{t - x} \tag{4.1}$$

exists. If F is differentiable at x, the limit (4.1) is called the derivative of F
at x and is denoted by $F'(x)$ or $(DF)(x)$. The function F is called differentiable
if F is differentiable at each point of I.


If $F$ is any function from $I$ to $R^m$ and if we fix a point $x \in I$, the
difference quotient

$$\frac{F(t) - F(x)}{t - x}$$

is a function defined at all points $t \in I$ such that $t \neq x$. Thus, it makes
sense to ask whether the limit (4.1) exists. Suppose it does, and let $G$ be
another function from $I$ to $R^m$ which is differentiable at $x$. From the
corresponding properties of limits, we see that $F + G$ is differentiable at $x$
and

$$(F + G)'(x) = F'(x) + G'(x). \tag{4.2}$$

If $F$ and $G$ are (complex) matrix-valued functions, then $FG$ is differentiable
at $x$ and

$$(FG)'(x) = F(x)G'(x) + F'(x)G(x). \tag{4.3}$$

The product rule (4.3) is derived by adding and then subtracting a term
to the difference quotient for $FG$:

$$\begin{aligned}
\frac{(FG)(t) - (FG)(x)}{t - x} &= \frac{F(t)G(t) - F(x)G(x)}{t - x} \\
&= F(t)\,\frac{G(t) - G(x)}{t - x} + \frac{F(t) - F(x)}{t - x}\,G(x).
\end{aligned}$$

As $t$ approaches $x$, $F(t)$ approaches $F(x)$ and the difference quotients
approach $G'(x)$ and $F'(x)$, respectively. The result follows from the rules
for limits of sums and products.

In the last paragraph we used the fact that, if $F$ is differentiable at $x$,
then

$$\lim_{t \to x} [F(t) - F(x)] = 0.$$

This is apparent, since in (4.1) the denominator of the difference quotient
is tending to 0 and the quotient could not approach a limit unless the
numerator were tending to 0 as well. What this says is that differentiability
at $x$ implies continuity at $x$. The reader should be aware that the converse
is false, as the example $f(x) = |x|$ at the point $x = 0$ shows.

If $F$ is differentiable at each point of $I$, then $DF = F'$ is a function on
$I$, called the derivative of $F$, and it makes sense to ask whether $DF$ is
differentiable. If it is, we denote its derivative by $D^2F$ or $F''$. The successive
derivatives (if they exist) are denoted by $F^{(n)} = D^nF$. We say that the
function $F$ is of class $C^k$ provided the kth derivative $D^kF$ exists and is
continuous. If $F$ has derivatives of all orders, then $F$ is of class $C^\infty$.

If $I$ is a semi-closed interval, then, at the closed end, the derivative
is only a one-sided derivative; e.g., from within $I = [a, b)$ approach to $a$
is possible only from the right. At times, it is useful to have left- and
right-hand derivatives at interior points of the interval. Let us discuss this
briefly.

Suppose $F$ is defined on a neighborhood of $x$. We consider the
difference quotient

$$\frac{F(t) - F(x)}{t - x}, \qquad x < t < x + \delta.$$

If that quotient has a limit at $x$, it is the right-hand derivative of $F$ at $x$:

$$(D^+F)(x) = \lim_{t \to x+} \frac{F(t) - F(x)}{t - x}. \tag{4.4}$$

The left-hand derivative of $F$ at $x$ (if it exists) is defined similarly

$$(D^-F)(x) = \lim_{t \to x-} \frac{F(t) - F(x)}{t - x}. \tag{4.5}$$

Either of the one-sided derivatives may fail to exist. They may both exist
and be different, as the example $f(x) = |x|$ at $x = 0$ shows. Evidently,
$(DF)(x)$ exists if and only if $(D^+F)(x)$ exists, $(D^-F)(x)$ exists, and
$(D^+F)(x) = (D^-F)(x)$.

Any map

$$I \xrightarrow{F} R^m$$

is described by an m-tuple of real-valued functions, $F = (f_1, \ldots, f_m)$.
From the corresponding property of limits, it is immediate that $F$ is
differentiable at $x$ precisely when every $f_j$ is differentiable at $x$. In this case
$F'(x) = (f_1'(x), \ldots, f_m'(x))$. Therefore, the entire discussion of
differentiation could be carried out using only real-valued functions; however, this
is not an advisable way to proceed. It is much better to learn to handle
differentiation of vector-valued functions directly, without using
coordinates.

This seems an appropriate time at which to discuss the special features
of differentiation of real-valued functions. If $f$ is a differentiable function,
then $f'(x)$ is the slope of the tangent line to the graph of $f$ at the point
$(x, f(x))$. It measures the rate at which $f$ is increasing at $x$. Thus, the
tangent line to the graph is horizontal at points where $f$ has a maximum
or minimum. This simple observation has a number of important
consequences.


Theorem 1. Let f be a real-valued function on an interval (a, b). Let
$x \in (a, b)$ and suppose that


(i) f has a local maximum (or local minimum) at x;
(ii) f is differentiable at x.


Then $f'(x) = 0$.

Proof. To say that $f$ has a local maximum at $x$ means that there exists
$\delta > 0$ such that

$$f(t) \le f(x), \qquad |x - t| < \delta. \tag{4.6}$$

From (4.6), it is clear that $(D^+f)(x) \le 0$, if $(D^+f)(x)$ exists. Similarly
$(D^-f)(x) \ge 0$ if $(D^-f)(x)$ exists. Hence, if they both exist and are equal,
their common value must be 0.


Corollary (Mean Value Theorem). If f is continuous on [a, b] and
differentiable on (a, b), there exists an $x \in (a, b)$ such that

$$f'(x) = \frac{f(b) - f(a)}{b - a}. \tag{4.7}$$

Proof. Let

$$g(x) = f(b) - f(x) + \frac{f(b) - f(a)}{b - a}\,(x - b).$$

Then $g$ satisfies the same conditions as $f$ and $g(a) = g(b) = 0$. The
continuous function $g$ attains its maximum and minimum on $[a, b]$. One of
those must occur at a point $x$ in the open interval $(a, b)$. Then $g'(x) = 0$,
which gives us (4.7).


Corollary. Suppose f is differentiable on (a, b). Then f is an increasing
function if and only if $f' \ge 0$; f is constant if and only if $f' = 0$.


Proof. The mean value theorem for derivatives.


Corollary (Chain Rule). Let F, g be differentiable functions such that
the image of g is contained in the domain of F:

$$(a, b) \xrightarrow{g} (c, d) \xrightarrow{F} R^m.$$

Then the composition $F \circ g$ is differentiable and $(F \circ g)' = g'(F' \circ g)$, i.e.,

$$(F \circ g)'(x) = g'(x)F'(g(x)).$$

Proof. Fix $x \in (a, b)$. From the definition of derivative

$$F(y) - F(g(x)) = [y - g(x)]F'(g(x)) + [y - g(x)]R(y, x)$$

where

$$\lim_{y \to g(x)} R(y, x) = 0.$$

Thus,

$$\left|\frac{F(g(t)) - F(g(x))}{t - x} - \frac{g(t) - g(x)}{t - x}\,F'(g(x))\right| \le \left|\frac{g(t) - g(x)}{t - x}\right| |R(g(t), x)|.$$


Corollary. If g is a differentiable real-valued function on the interval
(a, b) such that $g'(x) \neq 0$ for each $x \in (a, b)$, then g is 1:1, the inverse
function $g^{-1}$ is differentiable, and

$$(Dg^{-1})(g(x)) = \frac{1}{g'(x)}.$$

Proof. If we had $g(x) = g(t)$ with $x \neq t$, the mean value theorem
would provide a point $c$ between $x$ and $t$ with $g'(c) = 0$. Hence $g$ is 1:1,
that is, $g$ is either strictly increasing or strictly decreasing. As we observed
in Chapter 3 (Theorem 9) this means that the image of $(a, b)$ is an open
interval and $g^{-1}$ is continuous. Thus,

$$(Dg^{-1})(g(x)) = \lim_{t \to x} \frac{g^{-1}(g(t)) - g^{-1}(g(x))}{g(t) - g(x)} = \lim_{t \to x} \frac{t - x}{g(t) - g(x)} = \frac{1}{g'(x)}.$$


EXAMPLE 1. Let $A$ be a fixed $k \times k$ matrix with complex entries. Let

$$F(t) = e^{tA}, \qquad t \in R.$$

Then $F$ is a differentiable function. We have

$$\frac{F(t) - F(x)}{t - x} = \frac{e^{tA} - e^{xA}}{t - x}.$$

What we must investigate is

$$\lim_{t \to 0} \frac{e^{tA} - I}{t}.$$

Now

$$e^{tA} = \sum_{n=0}^{\infty} \frac{1}{n!} t^n A^n$$

so that

$$\begin{aligned}
t^{-1}(e^{tA} - I) - A &= \sum_{n=2}^{\infty} \frac{1}{n!} t^{n-1} A^n \\
|t^{-1}(e^{tA} - I) - A| &\le \sum_{n=2}^{\infty} |t|^{n-1} |A|^n \\
&\le |A| \cdot \frac{|t|\,|A|}{1 - |t|\,|A|}, \qquad |t|\,|A| < 1.
\end{aligned}$$

Now it is clear that $F$ is differentiable and

$$F'(x) = Ae^{xA} = AF(x). \tag{4.8}$$

Since $F' = AF$, it follows that $F$ is a function of class $C^\infty$.

We remark that the differential equation in (4.8) determines the
exponential function. If $G$ is a differentiable function from $R$ into $C^{k \times k}$
and if

$$G'(x) = AG(x)$$

where $A$ is some fixed matrix, then $e^{-xA}G(x)$ has derivative 0. Thus,

$$G(x) = e^{xA}B$$

where $B = G(0)$. Notice that, if $G(x) = Be^{xA}$, then in order that $G' = AG$
be satisfied, $B$ must commute with $A$, i.e., $AB = BA$.
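The estimate behind (4.8) says that the difference quotient $t^{-1}(e^{tA} - I)$ differs from $A$ by at most $|A| \cdot |t||A| / (1 - |t||A|)$ when $|t||A| < 1$. The sketch below is a numerical check of that inequality for one sample matrix, under the assumptions that $|A|$ is the Euclidean norm of the entries and that a 40-term partial sum is an adequate stand-in for $e^{tA}$; the function names are hypothetical.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def norm(a):
    """Euclidean norm of the entries (assumed to play the role of |A|)."""
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))

def exp_tA(a, t, terms=40):
    """Partial sum of e^{tA} = sum of t^n A^n / n! for n = 0, ..., terms."""
    size = len(a)
    term = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    total = [row[:] for row in term]
    for n in range(1, terms + 1):
        term = [[t * x / n for x in row] for row in mat_mul(term, a)]
        total = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total

def quotient_error(a, t):
    """|t^{-1}(e^{tA} - I) - A| for t != 0."""
    e = exp_tA(a, t)
    size = len(a)
    diff = [[(e[i][j] - (1.0 if i == j else 0.0)) / t - a[i][j]
             for j in range(size)] for i in range(size)]
    return norm(diff)

A = [[0.0, 1.0], [-1.0, 0.0]]
rows = []
for t in (0.1, 0.01, 0.001):
    r = abs(t) * norm(A)
    rows.append((quotient_error(A, t), norm(A) * r / (1 - r)))
    print(t, rows[-1][0], rows[-1][1])
```

Both columns shrink in proportion to $|t|$, which is the statement that the quotient converges to $A$ as $t \to 0$.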

The most important special case is, of course, the real-valued
function

$$f(t) = e^t, \qquad t \in R$$

which satisfies $f' = f$. Since $f'(t) \neq 0$, the last corollary tells us that the
inverse function $f^{-1}$, which we defined to be the log function, satisfies

$$(D \log)(e^t) = e^{-t} \quad \text{or} \quad (D \log)(x) = \frac{1}{x}.$$

EXAMPLE 2. Let $F$ be the complex-valued function on the real line:
$F(x) = e^{ix}$. According to Example 1, this is a function of class $C^\infty$ and
$F^{(n)}(x) = i^n F(x)$. Notice that, if we did not know about sines and cosines,
we could define the sine and cosine functions by:

$$\begin{aligned}
\sin x &= \operatorname{Im} (F(x)) \\
&= \operatorname{Im} \sum_{n=0}^{\infty} \frac{1}{n!}\, i^n x^n \\
&= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots \\
&= \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!}\, x^{2n+1}
\end{aligned}$$

$$\begin{aligned}
\cos x &= \operatorname{Re} (F(x)) \\
&= \operatorname{Re} \sum_{n=0}^{\infty} \frac{1}{n!}\, i^n x^n \\
&= \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!}\, x^{2n}.
\end{aligned}$$

These functions are of class $C^\infty$ and

$$D(\cos) = -\sin, \qquad D(\sin) = \cos.$$

Each of these functions satisfies the differential equation

$$f'' + f = 0. \tag{4.9}$$

Furthermore, any $f$ which satisfies (4.9) on an interval is a linear
combination of sin and cos. The proof of this should be very familiar to every
serious student of mathematics. For convenience, suppose $f$ is real-valued
and (4.9) holds on a neighborhood of 0. Let

$$g(x) = f(x) - f(0) \cos x - f'(0) \sin x.$$

Then $g'' + g = 0$ and $g(0) = g'(0) = 0$. Thus,

$$\begin{aligned}
g'g'' + gg' &= 0 \\
D((g')^2 + g^2) &= 0 \\
(g')^2 + g^2 &= \text{constant}.
\end{aligned}$$

Since $g(0) = g'(0) = 0$, the constant is 0; so,

$$\begin{aligned}
(g')^2 + g^2 &= 0 \\
g' = g &= 0 \quad \text{(since } g \text{ is real-valued)} \\
f(x) &= f(0) \cos x + f'(0) \sin x.
\end{aligned}$$

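The definition of sine and cosine through $e^{ix}$ is easy to try out numerically. The sketch below sums the exponential series with complex arithmetic and reads off real and imaginary parts; it then compares them with the library functions and checks that $\sin^2 + \cos^2$ stays equal to 1, which is the "constant" in the argument above for $g = \sin$. The function name `F` mirrors the text, while the number of terms and the sample points are assumptions chosen for the illustration.

```python
import math

def F(x, terms=40):
    """Partial sum of e^{ix} = sum of (ix)^n / n! for n = 0, ..., terms - 1."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= 1j * x / (n + 1)   # next term of the series
    return total

points = [k / 10.0 for k in range(-30, 31)]
max_sin_gap = max(abs(F(x).imag - math.sin(x)) for x in points)
max_cos_gap = max(abs(F(x).real - math.cos(x)) for x in points)
max_circle_gap = max(abs(abs(F(x)) - 1.0) for x in points)
print(max_sin_gap, max_cos_gap, max_circle_gap)
```

All three gaps are at the level of rounding error for $|x| \le 3$, where 40 terms of the series are far more than enough.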
EXAMPLE 3. Let f be a real-valued function on an interval. We call f 
a convex function if 


(4.10) flex + (Ι — Ομ] < f(x)  ( — ¢)f@) 
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FIGURE 16 


for all x, win the domain and all c such thatO - c< Ι. This says that, 
between x and u, the graph of f lies below the line joining the point 
(x, f(x)) to the point (u, f(u)). (See Figure 16.) Thus f(x) = x? and f (x) 
— e* define convex functions on the real line, while f(x) = sin x is a 
convex function on the interval [—z, 0]. 

The defining property (4.10) can be rewritten in another useful way by
observing that if x < t < u, then t = cx + (1 - c)u where
\[ c = \frac{u - t}{u - x}. \]
The condition (4.10) then says that
\[ f(t) \le \frac{u - t}{u - x}\, f(x) + \frac{t - x}{u - x}\, f(u) \]
which is more conveniently expressed as
\[ \frac{f(t) - f(x)}{t - x} \le \frac{f(u) - f(x)}{u - x}. \]
In other words, if x < t < u, the chord joining (x, f(x)) to (u, f(u)) has
slope at least as great as the chord from (x, f(x)) to (t, f(t)). This suggests
that for smooth functions convexity amounts to the fact that the derivative
f' is an increasing (non-decreasing) function. This is the case, and we have
left the proof to the Exercises.
As a matter of fact, the convexity property alone implies that f has
a derivative at most points and that (where defined) f' is an increasing
function. We sketch a proof. We know that if x < t < u, then
\[ \frac{f(t) - f(x)}{t - x} \le \frac{f(u) - f(x)}{u - x}. \]
Thus, unless x is the right end point of the interval of definition, the
right-hand derivative $(D^+f)(x)$ exists and it is given by
\[ (D^+f)(x) = \inf_{t > x} \frac{f(t) - f(x)}{t - x}. \]
Similarly, unless x is the left end point, the left-hand derivative exists and
\[ (D^-f)(x) = \sup_{t < x} \frac{f(t) - f(x)}{t - x}. \]
Since both $(D^+f)(x)$ and $(D^-f)(x)$ exist where they make sense, f is
continuous at each such point x. More than that can be said. A little thought
shows that
\[ (D^-f)(x) \le (D^+f)(x) \le (D^-f)(t) \le (D^+f)(t), \qquad t > x. \]
So, the left derivative $D^-f$ is (as is $D^+f$) an increasing function. At any
point where f fails to be differentiable, $(D^+f)(x) > (D^-f)(x)$ and so $D^-f$
has a jump. The sum of all those jumps cannot exceed $(D^-f)(b) - (D^+f)(a)$,
on the interval [a, b]. Therefore, there can be only a countable number of
jumps. Conclusion: A convex function f on an open interval is continuous;
it is differentiable except at a countable number of points; and the
derivative Df (where defined) is an increasing function.
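
The behavior of the chord slopes can be observed numerically. The sketch below is an illustration only, not part of the original text; the names `chord_slope` and `one_sided_slopes` and the particular step sizes are assumptions. For a convex function the slopes to the right of x decrease toward the right-hand derivative as the step shrinks, and the slopes to the left increase toward the left-hand derivative.

```python
import math

def chord_slope(f, x, t):
    """Slope of the chord joining (x, f(x)) to (t, f(t)), for t != x."""
    return (f(t) - f(x)) / (t - x)

def one_sided_slopes(f, x, steps=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Chord slopes to the left and to the right of x for shrinking steps."""
    left = [chord_slope(f, x, x - h) for h in steps]
    right = [chord_slope(f, x, x + h) for h in steps]
    return left, right

# |x| is convex but not differentiable at 0: the two one-sided limits differ.
print(one_sided_slopes(abs, 0.0))

# e^x is convex and differentiable at 0: both families approach 1.
print(one_sided_slopes(math.exp, 0.0))
```

For f(x) = |x| the left slopes stay at -1 and the right slopes stay at 1, which is the jump described above; for the exponential function the two families squeeze toward the common derivative.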


Exercises 


1. Suppose that F is differentiable at a point x. Near x we approximate F(t)
by the affine function
\[ A(t) = F(x) + (t - x) F'(x). \]
Show that the approximation satisfies
\[ \lim_{t \to x} \frac{F(t) - A(t)}{t - x} = 0. \]


2. Let F be defined on an open interval about x. Prove that, if $(D^+F)(x)$ and
$(D^-F)(x)$ both exist, then F is continuous at x.


3. Let
\[ f_n(x) = \begin{cases} x^n \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0. \end{cases} \]
Prove that $f_1$ is continuous but not differentiable at 0. Prove that $f_2$ is
differentiable but not of class $C^1$. What is the corresponding fact about $f_n$?


4. Let $f_n(x) = x^n |x|$, n = 0, 1, 2, .... Compute $f_n'$ and find the largest
integer k such that $f_n$ is of class $C^k$.


5. Let F be a differentiable function from an interval I into $R^m$ and suppose
that F' = 0. Prove that F is constant.


6. Let f be the real-valued function on the real line
\[ f(x) = \exp(-x^{-2}), \quad x \neq 0 \]
\[ f(0) = 0. \]
Prove that f is a function of class $C^\infty$ and that $f^{(n)}(0) = 0$ for every n.

7. Let f be a complex-valued function on an interval such that $f + f'' = 0$.
Find the form of f by using the fact that $g = f - if'$ and $h = f + if'$ satisfy
$g' = ig$, $h' = -ih$.


8. Let f, g be differentiable functions on R such that 


peer = A 
Df = g
Dg = -f
f(0) = 0
g(0) = 1.


Prove that f, g are what you think they are.


9. Let f be the Cantor function on [0, 1], which was defined in Example 24 of
Chapter 3. Prove that the derivative of f exists and is 0 at each point not in the
Cantor set. Prove that f is not differentiable at any point of the Cantor set.


10. Show that the exponential function on the real line is a convex function and
hence that
\[ e^x \ge e^t (1 + x - t) \]
for all real numbers x and t.


11. Let f be a continuous real-valued function on an interval I which is
differentiable at each interior point of I. Show that f is convex if and only if the
derivative f' is an increasing (non-decreasing) function. (It is the “if” half of the
result which we left unproved in the text.) What does this result tell you about
the form of convex functions of class $C^2$?


12. If f and g are convex functions, is max (f, g) a convex function? What about
min (f, g)?

13. Let f be a real-valued function on an interval. The following are equivalent.

(a) f is convex.

(b) For each x, there is a line through the point (x, f(x)) such that the
graph of f lies above that line.

(c) The set of all points in $R^2$ which lie above the graph of f is a convex set.

(d) f is continuous and
\[ f\left(\frac{x + u}{2}\right) \le \frac{1}{2}\,[f(x) + f(u)]. \]


14. Let f be a continuous complex-valued function on the real line, which
satisfies the functional equation
\[ f(x + t) = f(x) f(t). \]
If $f \neq 0$, show that there is a complex number a such that $f(t) = e^{at}$. (You
might ask first what kind of a function log |f| is.)


15. What about Exercise 14 for matrix-valued functions?


16. Let F be a differentiable function on the real line such that $|F'| \le |F|$ and
F(0) = 0. Prove that F = 0.

17. We know how to solve the differential equation $f' = af$. Suppose we
have a system of differential equations
\[ f_1' = a_{11} f_1 + \cdots + a_{1k} f_k \]
\[ \vdots \]
\[ f_k' = a_{k1} f_1 + \cdots + a_{kk} f_k \]
involving k unknown functions $f_j$ and $k^2$ known constants $a_{ij}$. Rewrite the
system in the form
\[ F' = AF \]
where A is the matrix of constants and
\[ F = \begin{bmatrix} f_1 \\ \vdots \\ f_k \end{bmatrix}. \]
Prove that every solution for F has the form $F(t) = e^{tA} X_0$, where $X_0$ is some
constant column matrix.


4.2. Integration on Intervals 


Let F be a continuous function on a closed interval I = [a, b]. We
want to define and discuss the integral of F over I:
\[ \int_I F = \int_a^b F(x)\,dx. \]


In case F is a non-negative (and continuous) real-valued function, this 
integral is a number which measures the area under the graph of F and it is 
defined by a process known as integration, which gives a systematic 
method for approximating the area. 

Since these ideas were first exploited systematically by Newton and 
Leibnitz 300 years ago, the range of applications of integration has been 
broadened enormously. This has been made possible in part by the modern 
definitions of the integral, in which the integration process is described in 
precise analytic terms, without appeal to any of its particular applications. 
In this chapter, we shall use a modified version of the formulation of 
integration given by Riemann. In Chapter 7 we will discuss the concept of 
integral given by Lebesgue, which is more general than Riemann’s because 
it allows one to integrate many more functions. But we have no need of 
such generality at this point because our primary goal is to understand the 
basic theorems of calculus. Furthermore, it is important for every student 
of analysis to know that Riemann’s process of integration converges for 
continuous functions. 

In order to present the Riemann integral it is convenient to introduce
two terms. A partition of the interval [a, b] is any (n + 1)-tuple of the form
\[ P = (x_0, \ldots, x_n), \qquad a = x_0 < x_1 < \cdots < x_n = b. \tag{4.11} \]
Such a partition divides the interval into n subintervals, and the greatest
of the lengths of these subintervals is called the mesh of the partition
\[ \|P\| = \max\,(x_k - x_{k-1}) \qquad (1 \le k \le n). \tag{4.12} \]


Suppose F is a function from the interval I = [a, b] into $R^m$. The
integral of F over I, if it exists, will be a limit of sums
\[ S(F, P, T) = \sum_{k=1}^{n} (x_k - x_{k-1}) F(t_k) \tag{4.13} \]
where
\[ P = (x_0, \ldots, x_n) \]
is a partition of [a, b] and
\[ T = (t_1, \ldots, t_n), \qquad t_k \in [x_{k-1}, x_k] \]
is a choice of points $t_k$, one from each subinterval defined by the partition.
Such sums are frequently referred to as “Riemann sums”, although we
shall not use this terminology. The “limit” of these sums is to be taken as
the mesh of the partition goes to 0; loosely speaking
\[ \int_I F = \lim_{\|P\| \to 0} S(F, P, T). \]
There are so many (uncountably many) different partitions and different
choices of points at which to evaluate the function that we will need to be
careful how we describe the “limit” involved.
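
The sums (4.13) and the mesh (4.12) translate directly into a short computation. This is a minimal sketch under stated assumptions, not part of the original text: the function names `uniform_partition`, `mesh`, and `riemann_sum` are hypothetical, and partitions and choices of points are represented as plain Python lists.

```python
def uniform_partition(a, b, n):
    """The partition a = x_0 < x_1 < ... < x_n = b with equal spacing."""
    return [a + (b - a) * k / n for k in range(n + 1)]

def mesh(P):
    """Greatest length of a subinterval of the partition P, as in (4.12)."""
    return max(x1 - x0 for x0, x1 in zip(P, P[1:]))

def riemann_sum(F, P, T):
    """The sum S(F, P, T) of (4.13); T[k] must lie in [P[k], P[k+1]]."""
    if len(T) != len(P) - 1:
        raise ValueError("need exactly one point per subinterval")
    for x0, x1, t in zip(P, P[1:], T):
        if not (x0 <= t <= x1):
            raise ValueError("each chosen point must lie in its subinterval")
    return sum((x1 - x0) * F(t) for x0, x1, t in zip(P, P[1:], T))

P = uniform_partition(0.0, 1.0, 1000)
right_ends = P[1:]      # one admissible choice T
left_ends = P[:-1]      # another admissible choice T
square = lambda x: x * x
print(mesh(P), riemann_sum(square, P, right_ends), riemann_sum(square, P, left_ends))
```

Both choices of points give sums near 1/3 for F(x) = x^2 on [0, 1]; the difference between them is one concrete instance of the dependence on T that the definition of the integral has to control.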

For any function which is integrable by the process we shall describe,
it will always be possible (theoretically) to calculate the integral using any
particular sequence of partitions $\{P_n\}$ for which the meshes tend to zero
and any corresponding sequence of choices $T_n$. Any two such sequences
will provide a sequence of vectors (approximating sums)
\[ S_n = S(F, P_n, T_n) \]
which converges to the integral of F. For example, if
\[ S_n = \sum_{k=1}^{n} \frac{b - a}{n}\, F\!\left(a + \frac{k}{n}(b - a)\right) \tag{4.14} \]
then for each integrable function F we shall have
\[ \lim_n S_n = \int_a^b F(x)\,dx. \]
One might ask, therefore, why we do not use the sequence in (4.14) to
define the integral, since it seems so much simpler than involving all sorts
of partitions P and choices T, as in (4.13). There are two reasons, one
practical and one theoretical.
(i) Even for very smooth functions F the particular sequence (4.14)
may not provide the most convenient way of calculating the integral. We 
want to be able to use any sequence of partitions with meshes tending to 0 
and know that the sums converge to the same limit. In other words, we 
want to be able to choose the sequence most convenient for each particular 
function F. 

(ii) We need to deal with integrals of some functions which are not
smooth and, therefore, we must be precise about the class of functions to 
which the integration process applies. Simple examples show that, were we 
to have the definition of this class of functions depend upon the values of 
the function at a particular sequence of points such as (k/n)(b — a), we 
would generate chaos. 


EXAMPLE 4. Let f be the real-valued function defined by
\[ f(x) = \begin{cases} 0, & \text{if x is rational} \\ 1, & \text{if x is irrational.} \end{cases} \]
Suppose a < b and let $P = (x_0, x_1, \ldots, x_n)$ be any partition of [a, b].
Each interval $[x_{k-1}, x_k]$ has positive length and, accordingly, it contains
both rational and irrational points. For each k, choose a rational point
$t_k \in [x_{k-1}, x_k]$ and an irrational point $t_k' \in [x_{k-1}, x_k]$. From the
definition of f we have $f(t_k) = 0$ and $f(t_k') = 1$ for each k. Accordingly,
\[ S(f, P, T) = 0 \]
\[ S(f, P, T') = b - a. \]
Thus we can choose sequences $(P_n, T_n)$ with $\|P_n\| \to 0$ for which
$S(f, P_n, T_n) = 0$ and we can choose other sequences of this type for which
$S(f, P_n, T_n) = b - a$. This leaves us a bit perplexed as to what the integral
of f ought to be. In order to avoid this confusion, we shall define the class of
functions which are “integrable” in such a way as to exclude this function f.


Definition. Let F be a function from the interval [a, b] into $R^m$. We say
that F is Riemann-integrable if there exists a vector S in $R^m$ with this
property: For each $\epsilon > 0$ there exists $\delta > 0$ such that if P is any
partition of [a, b] with $\|P\| < \delta$ and if T is any choice of points in the
subintervals defined by P, then
\[ |S - S(F, P, T)| < \epsilon. \]


Although this definition sounds somewhat complicated, it expresses 
a rather simple idea: The function F will be called Riemann-integrable if 
there exists a vector S (which will be the integral of F) to which the approxi- 
mating sums S(F, P, T) converge as || P|| > 0. Converge in which sense? 
In the sense that S(F, P, T) will be close to S provided only that $\|P\|$ is
small, and with no condition on T, the choice of points in the subintervals
of P.
It should be clear that this basic definition can be reformulated as
follows. The function F is Riemann-integrable if (and only if) there exists
a vector S in $R^m$ such that
\[ \lim_n S(F, P_n, T_n) = S \]
for every sequence of partitions $P_n$ with $\|P_n\| \to 0$ and any corresponding
choice(s) $T_n$ of points in the subintervals defined by $P_n$. Thus the point of
Example 4 can be described by saying that the function f which occurs
there is not Riemann-integrable.

If F is Riemann-integrable, the vector S which occurs in the definition
is obviously unique. It is called the integral of F over the interval I = [a, b]
and is denoted by $\int_I F$ or $\int_a^b F(x)\,dx$.


A word about this notation. Part of the power of calculus as a tool
derives from the very useful notation which Leibnitz invented for
derivatives and integrals. For real-valued functions on an interval [a, b] the
integral was denoted by
\[ \int_a^b f\,dx \]
as a reminder of the process of integration
\[
\int_a^b f(x)\,dx = \lim_{\|P\| \to 0} \sum_k f(t_k)(x_k - x_{k-1})
 = \lim_{\max \Delta x_k \to 0} \sum_k f(x_k)\,\Delta x_k
 = \lim_{\Delta x \to 0} \sum f(x)\,\Delta x.
\]
We have not used the $\Delta x_k = x_k - x_{k-1}$ notation, in part because for
vector-valued functions it would then be natural to write
\[ \lim_{\|P\| \to 0} \sum_{k=1}^{n} \Delta x_k\, F(x_k) = \int_a^b dx\, F(x) \]
which is potentially confusing.

We want to direct our efforts immediately to proving that various 
well-behaved functions (e.g., continuous functions) are Riemann-integrable. 
In other words, we want to show that the process of integration “con- 
verges” for various well-behaved functions. We need a criterion for con- 
vergence analogous to the Cauchy criterion for sequences, i.e., a criterion 
which does not explicitly involve the limit vector. 

If F is any function from [a, b] into $R^m$, then, for each $\delta > 0$, we
define the set $\Sigma_\delta = \Sigma_\delta(F)$ as follows. We consider all partitions
$P = (x_0, \ldots, x_n)$ of the interval [a, b] such that $\|P\| \le \delta$ and all possible
corresponding choices T of points $t_k \in [x_{k-1}, x_k]$. Each such pair (P, T)
determines a sum
\[ S(F, P, T) = \sum_{k=1}^{n} (x_k - x_{k-1}) F(t_k) \]
and $\Sigma_\delta(F)$ is defined to be the set of all such sums:
\[ \Sigma_\delta(F) = \{S(F, P, T);\ \|P\| \le \delta,\ t_k \in [x_{k-1}, x_k]\}. \tag{4.15} \]

If F is Riemann-integrable then, for small $\delta$, all the sums S(F, P, T)
with $\|P\| \le \delta$, i.e., all the vectors in $\Sigma_\delta(F)$, will be close to the integral
of F. Thus, when $\delta$ is small, $\Sigma_\delta$ must be a set of small diameter clustered
near the integral of F. And the integral is approximable by vectors in $\Sigma_\delta$,
that is, the integral lies in the closure $\bar{\Sigma}_\delta$. What happens to the set $\Sigma_\delta$
as we decrease $\delta$? If $0 < \eta < \delta$, then it is immediate from the definition
(4.15) that $\Sigma_\eta \subset \Sigma_\delta$. This is true for any F. For any integrable F it must
be that, as $\delta$ tends to 0, the closed sets $\bar{\Sigma}_\delta$ close down on the single vector
$\int_I F$. In other words, the diameters of the sets $\Sigma_\delta$ converge to 0 and $\int_I F$
is the single vector contained in the intersection
\[ \bigcap_{\delta > 0} \bar{\Sigma}_\delta. \]

Theorem 2. Let F be a function from the interval [a, b] into $R^m$, and
define the set $\Sigma_\delta(F)$ by (4.15). Then F is Riemann-integrable if and only if
\[ \lim_{\delta \to 0} \operatorname{diam} \Sigma_\delta = 0. \tag{4.16} \]

Proof. It is the “if” statement which remains to be proved. Assume
(4.16) and consider the intersection of the closures of the sets $\Sigma_\delta$:
\[ \bigcap_{\delta > 0} \bar{\Sigma}_\delta. \]
By the condition on the diameters, there cannot be more than one vector
in this intersection. Furthermore, the intersection is non-empty for this
reason. We know that $\Sigma_\eta \subset \Sigma_\delta$ if $\eta < \delta$. Hence the closed sets $\bar{\Sigma}_\delta$ are
nested:
\[ \bar{\Sigma}_\eta \subset \bar{\Sigma}_\delta, \qquad \eta < \delta. \]
Furthermore, since the diameters of these sets tend to 0, they are bounded
(sets) for all sufficiently small $\delta$. Thus their intersection is non-empty. (See
Theorem 9 of Chapter 2.) Let S be the single vector in the intersection. If
$\epsilon > 0$, choose $\delta > 0$ such that $\operatorname{diam} \Sigma_\delta(F) < \epsilon$. If P is any partition of
[a, b] with $\|P\| < \delta$ and if T is any choice of points in the subintervals
defined by P, then the sum S(F, P, T) is in the set $\Sigma_\delta(F)$ and the vector S
is in $\bar{\Sigma}_\delta(F)$. Since $\operatorname{diam} \bar{\Sigma}_\delta(F) < \epsilon$,
\[ |S - S(F, P, T)| < \epsilon. \]
Thus F is Riemann-integrable (and $S = \int_I F$).
In order to use the diameter criterion (4.16) to show that a given
function F is Riemann-integrable, one needs to estimate, for a given $\delta$,
what the maximum distance is between any two approximating sums,
formed by partitions of mesh at most $\delta$. To carry out such estimates, we
need to see what happens when we pass from a given partition to a more
refined one.

Let $P = (x_0, \ldots, x_n)$ and $Q = (y_0, \ldots, y_p)$ be partitions of the
interval [a, b]. We say that Q is a refinement of P if $\{x_0, \ldots, x_n\}$ is a subset
of $\{y_0, \ldots, y_p\}$. In other words Q is a refinement of P if Q can be obtained
by starting with the points $x_0 < x_1 < \cdots < x_n$ and interposing some
p - n points between them to form a finer subdivision. The fact that Q is
a refinement of P guarantees among other things that $\|Q\| \le \|P\|$, but
note that it is a much stronger condition than the (mere) fact that Q has a
smaller mesh than does P.


Lemma. Let F be a function from [a, b] into $R^m$. Let $P = (x_0, \ldots, x_n)$
and $Q = (y_0, \ldots, y_p)$ be partitions of [a, b] such that Q is a refinement of
P. If T is any choice of points $t_k \in [x_{k-1}, x_k]$, k = 1, ..., n, there exist
points $u_1, \ldots, u_p$ in [a, b] such that

(i) $y_{j-1} - \|P\| < u_j < y_j + \|P\|$, j = 1, ..., p;

(ii) $\sum_{k=1}^{n} (x_k - x_{k-1}) F(t_k) = \sum_{j=1}^{p} (y_j - y_{j-1}) F(u_j)$.


Proof. Let j be an index, $1 \le j \le p$. We shall define the corresponding
point $u_j$. Since Q is a refinement of P, none of the points $x_k$ lies in the open
interval $(y_{j-1}, y_j)$. Therefore the closed interval $[y_{j-1}, y_j]$ must lie wholly
within one of the intervals $[x_{i-1}, x_i]$. In fact, since the interiors of the
different intervals $[x_{i-1}, x_i]$ do not overlap, there is precisely one index i,
$1 \le i \le n$, such that $[y_{j-1}, y_j] \subset [x_{i-1}, x_i]$. Call this index i(j) and define
$u_j = t_{i(j)}$. Since $u_j$ lies in an interval of length at most $\|P\|$ which contains
$[y_{j-1}, y_j]$, condition (i) holds.

Now observe that
\[
\sum_{j=1}^{p} (y_j - y_{j-1}) F(u_j)
 = \sum_{k=1}^{n} \sum_{i(j)=k} (y_j - y_{j-1}) F(u_j)
 = \sum_{k=1}^{n} \sum_{i(j)=k} (y_j - y_{j-1}) F(t_k).
\]
For a particular k, the indices j such that i(j) = k are those for which
$[y_{j-1}, y_j] \subset [x_{k-1}, x_k]$. Since Q is a refinement of P, these intervals form a
partition of $[x_{k-1}, x_k]$, that is,
\[ \sum_{i(j)=k} (y_j - y_{j-1}) = x_k - x_{k-1}. \]
This proves the lemma.
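
The construction in the lemma is easy to carry out mechanically. The following is a rough sketch, not part of the original text: the names `common_refinement`, `is_refinement`, `inherited_points`, and `weighted_sum` are hypothetical, partitions are sorted lists of floats, and the index i(j) is located with a binary search on the midpoint of each subinterval of Q.

```python
from bisect import bisect_right

def common_refinement(P, P_prime):
    """Distinct points of both partitions, arranged in increasing order."""
    return sorted(set(P) | set(P_prime))

def is_refinement(Q, P):
    """Q refines P when every point of P is also a point of Q."""
    return set(P) <= set(Q)

def inherited_points(P, T, Q):
    """For each subinterval of Q, the point t_{i(j)} chosen for the
    subinterval of P that contains it (the u_j of the lemma)."""
    U = []
    for y0, y1 in zip(Q, Q[1:]):
        mid = (y0 + y1) / 2
        i = bisect_right(P, mid)     # P[i-1] <= mid < P[i]
        U.append(T[i - 1])           # T[i-1] was chosen in [P[i-1], P[i]]
    return U

def weighted_sum(F, Q, U):
    """Sum of (y_j - y_{j-1}) F(u_j); u_j need not lie in [y_{j-1}, y_j]."""
    return sum((y1 - y0) * F(u) for y0, y1, u in zip(Q, Q[1:], U))

P, T = [0.0, 0.5, 1.0], [0.25, 0.75]
Q = common_refinement(P, [0.0, 0.3, 1.0])
U = inherited_points(P, T, Q)
F = lambda x: x * x
print(Q, U, weighted_sum(F, P, T), weighted_sum(F, Q, U))
```

The two printed sums agree up to rounding, which is statement (ii) of the lemma for this small example.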


Theorem 3. If F is a continuous function from the closed interval [a, b] 
into R™, then F is Riemann-integrable. 


Proof. The idea of the proof is to use the uniform continuity of F
to estimate the diameters of the sets $\Sigma_\delta(F)$. Let $\delta > 0$. Let P, P' be
partitions of [a, b] of mesh at most $\delta$ and let $T = (t_1, \ldots, t_n)$,
$T' = (t_1', \ldots, t_{n'}')$ be choices of points in the subintervals determined by P
and P', respectively. We want to estimate the distance from S(F, P, T) to
S(F, P', T'). To do this, we need to rewrite them as sums over the same
partition. Accordingly, let $Q = (y_0, \ldots, y_p)$ be the partition which is made up
of the distinct points in the sequence $x_0, \ldots, x_n, x_0', \ldots, x_{n'}'$, arranged in
increasing order. Then Q is a refinement of P and also a refinement of P'.
By the preceding lemma, there exist points $u_1, \ldots, u_p$ and $u_1', \ldots, u_p'$ in
[a, b] such that
\[ S(F, P, T) = S(F, Q, U) \]
\[ S(F, P', T') = S(F, Q, U'). \]
We do not know that $u_j$ and $u_j'$ lie in the interval $[y_{j-1}, y_j]$. But, from the
lemma, we do know that both $u_j$ and $u_j'$ lie in the interval
$[y_{j-1} - \delta,\ y_j + \delta]$. Thus,
\[ S(F, P, T) - S(F, P', T') = \sum_{j=1}^{p} (y_j - y_{j-1}) [F(u_j) - F(u_j')] \]
and $|u_j - u_j'| \le 2\delta$, j = 1, ..., p. Let
\[ M = \max_j |F(u_j) - F(u_j')| \]
and we have
\[ |S(F, P, T) - S(F, P', T')| \le M \sum_{j=1}^{p} (y_j - y_{j-1}) = M(b - a). \tag{4.17} \]

The idea now is to show that M is small whenever $\|P\|$ and $\|P'\|$ are
small. The critical observation is that F is continuous on the closed interval
[a, b], and is therefore uniformly continuous (Theorem 12 of Chapter 3).
This guarantees that, when $\delta$ is small, each of the numbers $|F(u_j) - F(u_j')|$
will be small, because $u_j$, $u_j'$ are not more than $2\delta$ apart. To be precise, if
$\omega$ is the modulus of continuity of F (see (3.17)):
\[ \omega(F, \eta) = \sup_{|x - t| \le \eta} |F(x) - F(t)| \]
then (4.17) tells us that
\[ |S(F, P, T) - S(F, P', T')| \le (b - a)\,\omega(F, 2\delta). \]
Thus,
\[ \operatorname{diam} \Sigma_\delta(F) \le (b - a)\,\omega(F, 2\delta). \]
Now
\[ \lim_{\eta \to 0} \omega(F, \eta) = 0 \]
because this is what it means to say that F is uniformly continuous. We
conclude that F is Riemann-integrable.


Corollary. If F is a continuous function on the interval [a, b], if
$P = (x_0, \ldots, x_n)$ is a partition of [a, b] and $T = (t_1, \ldots, t_n)$ any choice of
points $t_k \in [x_{k-1}, x_k]$ then
\[ \left| \int_a^b F(x)\,dx - \sum_{k=1}^{n} (x_k - x_{k-1}) F(t_k) \right| \le (b - a)\,\omega(F, 2\|P\|) \]
where $\omega$ is the modulus of continuity of F.

Proof. Let $\delta = \|P\|$. In the proof of the theorem we verified that
\[ \operatorname{diam} \Sigma_\delta(F) \le (b - a)\,\omega(F, 2\delta). \]
Since the integral of F is in the closure of $\Sigma_\delta(F)$, its distance from the sum
S(F, P, T) cannot exceed $(b - a)\,\omega(F, 2\delta)$.
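
The bound in the corollary can be tested on a function whose modulus of continuity is known in closed form. The sketch below is an illustration only, not part of the original text. It assumes F(x) = x^2 on [0, 1], for which the supremum of |x^2 - t^2| over |x - t| <= eta equals eta(2 - eta) when eta <= 1 (attained at x = 1, t = 1 - eta). All function names are hypothetical.

```python
def mesh(P):
    """Greatest subinterval length of the partition P."""
    return max(x1 - x0 for x0, x1 in zip(P, P[1:]))

def riemann_sum(F, P, T):
    """The sum of (x_k - x_{k-1}) F(t_k) over the subintervals of P."""
    return sum((x1 - x0) * F(t) for x0, x1, t in zip(P, P[1:], T))

def omega_square(eta):
    """Modulus of continuity of x -> x**2 on [0, 1] (closed form, see lead-in)."""
    eta = min(eta, 1.0)
    return eta * (2.0 - eta)

def error_and_bound(n, pick):
    """Actual error of one sum for x**2 on [0, 1], and the corollary's bound."""
    P = [k / n for k in range(n + 1)]
    T = [pick(x0, x1) for x0, x1 in zip(P, P[1:])]
    error = abs(1.0 / 3.0 - riemann_sum(lambda x: x * x, P, T))
    bound = (1.0 - 0.0) * omega_square(2.0 * mesh(P))
    return error, bound

for n in (5, 10, 100):
    print(n, error_and_bound(n, lambda x0, x1: x1))   # right end points
```

The bound is far from sharp for a smooth function such as this one; its value lies in holding for every continuous F and every choice of points.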


Exercises 


1. Let F be a Riemann-integrable function on the closed interval I. Show that,
if c is any real number, then cF is Riemann-integrable and
\[ \int_I cF = c \int_I F. \]
(Work directly with $c \int_I F - S(cF, P, T)$ rather than $\Sigma_\delta(cF)$.)


2. In Exercise 1, if F is a complex-valued function and c is a complex number,
is the same assertion valid? (Of course, we regard a complex-valued function as
a map into $R^2$.)


3. Let F and G be functions from the closed interval I into $R^m$. Show that, if F
and G are Riemann-integrable, then F + G is and
\[ \int_I (F + G) = \int_I F + \int_I G. \]


4. If f and g are real-valued Riemann-integrable functions on the closed
interval I such that $f \ge g$, then
\[ \int_I f \ge \int_I g. \]


5. If f is a non-negative continuous function on [a, b] and
\[ \int_a^b f(x)\,dx = 0 \]
then f = 0.


6. Use the result of Exercise 5 to prove the following. If f is a continuous
real-valued function on [a, b] and
\[ \int_a^b f(x)\,dx = 0 \]
then there exists a point c, $a \le c \le b$, such that f(c) = 0.


7. Use the result of Exercise 6 to prove the first mean value theorem of integral
calculus: If f is a continuous real-valued function on the interval [a, b], there exists
a point c, $a \le c \le b$, such that
\[ \int_a^b f(x)\,dx = (b - a) f(c). \]

8. Give an example which shows that the mean value theorem of Exercise 7 is
false for (some) complex-valued continuous functions.


9. Let f be a real-valued continuous function on [a, b] such that

(i) $|f| \le 1$;

(ii) $\int_a^b f(x)\,dx = b - a$.

Prove that f = 1.

10. Prove Exercise 9 for complex-valued continuous functions f.


11. Let f be a complex-valued continuous function on a closed interval I such that
\[ \int_I f = \int_I |f|. \]
Prove that f is real-valued and $f \ge 0$.


12. Prove the Cauchy-Schwarz inequality for integrals of continuous
functions: If f, g are continuous real-valued functions on [a, b], then
\[ \left( \int_a^b fg \right)^2 \le \int_a^b f^2 \int_a^b g^2. \]


4.3. Fundamental Properties of the Integral


We shall summarize the properties of integration which are used 
repeatedly. 


Theorem 4. Let F be a function from the closed interval I = [a, b] into
$R^m$ and suppose F is Riemann-integrable.

(i) (Positivity). If F is non-negative real-valued, i.e., if m = 1 and
$F \ge 0$, then
\[ \int_I F \ge 0. \]

(ii) (Linearity). If G is another Riemann-integrable function from I
into $R^m$ and if c is any real number, then (cF + G) is Riemann-integrable
and
\[ \int_I (cF + G) = c \int_I F + \int_I G. \]

(iii) (Boundedness). The function F is bounded and
\[ \left| \int_a^b F(x)\,dx \right| \le (b - a) \sup_{a \le x \le b} |F(x)|. \]

(iv) The function |F| is Riemann-integrable and
\[ \left| \int_I F \right| \le \int_I |F|. \]
Proof. (i) and (ii) were to be verified as part of the exercises for
Section 4.2. They are repeated here for emphasis.

(iii) Let $P = (x_0, \ldots, x_n)$ be any partition of [a, b] and $T = (t_1, \ldots, t_n)$
any choice of points $t_k \in [x_{k-1}, x_k]$. The sum
\[ S(F, P, T) = \sum_k (x_k - x_{k-1}) F(t_k) \]
satisfies
\[ |S(F, P, T)| \le \sum_k (x_k - x_{k-1}) |F(t_k)|. \tag{4.18} \]
If
\[ M = \sup_{a \le x \le b} |F(x)| \]
then it is apparent from (4.18) that
\[ |S(F, P, T)| \le M \sum_k (x_k - x_{k-1}) = M(b - a). \]
Since this is true for every sum S(F, P, T), we also have
\[ \left| \int_a^b F(x)\,dx \right| \le M(b - a). \]
But how do we know that $M < \infty$, i.e., that F is bounded? Suppose that
F is not bounded. Let $P = (x_0, \ldots, x_n)$ be any partition of [a, b]. Then
there must be (at least) one index k such that F is not bounded on the
interval $[x_{k-1}, x_k]$. Fix such a value of k, and select a sequence of points
$\{c_j\}$ in the interval $[x_{k-1}, x_k]$ such that
\[ \lim_j |F(c_j)| = \infty. \tag{4.19} \]
Let $T_j$ be the choice of points in the subintervals defined by P for which
$t_i = x_i$, $i \neq k$, and $t_k = c_j$. That is, let
$T_j = (x_1, \ldots, x_{k-1}, c_j, x_{k+1}, \ldots, x_n)$. Then
\[ S(F, P, T_j) = \sum_{i \neq k} (x_i - x_{i-1}) F(x_i) + (x_k - x_{k-1}) F(c_j). \]
From (4.19) we see that
\[ \lim_j |S(F, P, T_j)| = \infty. \]
Since there is such a sequence $\{T_j\}$ for each partition P, the set $\Sigma_\delta(F)$ is
unbounded, for every $\delta > 0$. Thus, F is not Riemann-integrable.
(iv) The inequality
\[ \left| \sum_k (x_k - x_{k-1}) F(t_k) \right| \le \sum_k (x_k - x_{k-1}) |F(t_k)| \]
states that
\[ |S(F, P, T)| \le S(|F|, P, T) \]
for all partitions P and choices T. Thus, it is apparent that
\[ \left| \int_I F \right| \le \int_I |F| \tag{4.20} \]
if |F| is known to be Riemann-integrable. For instance, if F is continuous
we know that |F| is continuous and therefore Riemann-integrable; hence
the norm inequality (4.20) holds. For most of this book the functions
which we integrate will be piecewise-continuous (defined later) and it is
a trivial fact that if F is such a function, then |F| is as well. Thus on first
reading one should omit the proof that the integrability of |F| follows
from the integrability of F. It is given in the next section, where we discuss
some technical facts about Riemann-integrability of real-valued functions.


Theorem 5. Let F be a function from the interval [a, b] into $R^m$ and let c
be a point such that a < c < b. If F is Riemann-integrable on the interval
[a, c] and on the interval [c, b] as well, then F is Riemann-integrable on
[a, b] and
\[ \int_a^b F(x)\,dx = \int_a^c F(x)\,dx + \int_c^b F(x)\,dx. \]


Proof. The idea here is relatively simple, although it does take a few
lines to write down the details. Let $P = (x_0, \ldots, x_n)$ be any partition of
[a, b]. Then there is a unique index k such that $x_{k-1} < c \le x_k$. Accordingly,
$Q = (x_0, \ldots, x_{k-1}, c)$ and $R = (c, x_k, \ldots, x_n)$ are partitions of [a, c]
and [c, b], respectively. Let $T = (t_1, \ldots, t_n)$ be any choice of points
$t_i \in [x_{i-1}, x_i]$. Then $U = (t_1, \ldots, t_{k-1}, c)$ and $V = (c, t_{k+1}, \ldots, t_n)$ are
choices of points within the subintervals of Q and R. Now
\[ S(F, Q, U) = \sum_{i=1}^{k-1} (x_i - x_{i-1}) F(t_i) + (c - x_{k-1}) F(c) \]
\[ S(F, R, V) = \sum_{i=k+1}^{n} (x_i - x_{i-1}) F(t_i) + (x_k - c) F(c) \]
\[ S(F, P, T) = \sum_{i \neq k} (x_i - x_{i-1}) F(t_i) + (x_k - x_{k-1}) F(t_k). \]
Therefore,
\[ S(F, P, T) = S(F, Q, U) + S(F, R, V) + (x_k - x_{k-1}) [F(t_k) - F(c)]. \tag{4.21} \]
The idea is that, when $\|P\|$ is small, the first term on the right will be near
$\int_a^c F$, the second will be near $\int_c^b F$, and the third will be small. To be
precise, let $\epsilon > 0$. Since F is integrable on [a, c], there exists $\delta > 0$ such that
\[ \left| \int_a^c F(x)\,dx - S(F, Q, U) \right| < \frac{\epsilon}{3} \tag{4.22} \]
for all partitions Q of [a, c] such that $\|Q\| < \delta$ (and for all choices U).
Similarly, there exists $\eta > 0$ such that
\[ \left| \int_c^b F(x)\,dx - S(F, R, V) \right| < \frac{\epsilon}{3} \tag{4.23} \]
for all partitions R of [c, b] such that $\|R\| < \eta$ (and for all V). Now
\[ |(x_k - x_{k-1}) [F(t_k) - F(c)]| \le 2 \|P\| M \tag{4.24} \]
where
\[ M = \sup_{a \le x \le b} |F(x)|. \]
Let P be any partition of [a, b] such that
\[ \|P\| < \delta, \qquad \|P\| < \eta, \qquad 2M\|P\| < \frac{\epsilon}{3} \]
and let T be a corresponding choice of points $t_i$. Use the point c to
construct the corresponding Q, R, U, and V. Then (4.22), (4.23) are applicable
and the term on the right side of (4.24) is less than $\epsilon/3$. From (4.21) we have
\[ \left| \int_a^c F(x)\,dx + \int_c^b F(x)\,dx - S(F, P, T) \right| < \epsilon. \]
It follows that F is Riemann-integrable on [a, b] and that its integral is the
sum of the integrals on the two subintervals.


It is conventional and extremely useful to define
\[ \int_a^b F(x)\,dx = -\int_b^a F(x)\,dx \]
whenever b < a and F is integrable on [b, a]. The addition formula
\[ \int_a^b F = \int_a^c F + \int_c^b F \]
is then valid with a, b, c in any order, as long as the function is integrable
on the largest of the three intervals.


Definition. Let F be a function from the closed interval [a, b] into $R^m$.
We say that F is piecewise-continuous if there is a partition $P = (x_0, \ldots, x_n)$
of [a, b] such that F is uniformly continuous on each of the open intervals
$(x_{i-1}, x_i)$, i = 1, ..., n.


Theorem 6. Every piecewise-continuous function on the interval [a, b] 
is Riemann-integrable. 


Proof. Let F be piecewise-continuous and let $P = (x_0, \ldots, x_n)$ be a
partition of [a, b] such that F is uniformly continuous on each subinterval
$(x_{k-1}, x_k)$. By Theorem 5, it suffices to show that F is Riemann-integrable
on each subinterval $[x_{k-1}, x_k]$. Fix an index k. Since F is uniformly
continuous on $(x_{k-1}, x_k)$ we can extend F to a continuous function on the
closed interval $[x_{k-1}, x_k]$, that is, there is a continuous function G on
$[x_{k-1}, x_k]$ such that G(x) = F(x), $x_{k-1} < x < x_k$. We know that G is
Riemann-integrable and, since F(x) differs from G(x) for at most two
points $x \in [x_{k-1}, x_k]$, F is Riemann-integrable. (See Exercise 2.)


Theorem 7 (Fundamental Theorem of Calculus). Let F be a function
from [a, b] into $R^m$ which is Riemann-integrable. Let G be the function
defined by
\[ G(x) = \int_a^x F(u)\,du, \qquad a \le x \le b. \]
Then G is a continuous function on [a, b]. If x is any point of [a, b] at which
F is continuous, then G is differentiable at x and G'(x) = F(x).


Proof. We have
\[ G(t) - G(x) = \int_a^t F(u)\,du + \int_x^a F(u)\,du = \int_x^t F(u)\,du. \]
Thus,
\[ |G(t) - G(x)| \le |t - x| \sup |F|. \]
Since F is bounded, this shows (directly) that G is uniformly continuous.
Now, let x be a point at which F is continuous. We have
\[ \frac{G(t) - G(x)}{t - x} = \frac{1}{t - x} \int_x^t F(u)\,du. \]
Therefore,
\[
\frac{G(t) - G(x)}{t - x} - F(x) = \frac{1}{t - x} \int_x^t F(u)\,du - F(x)
 = \frac{1}{t - x} \int_x^t [F(u) - F(x)]\,du
\]
and
\[ \left| \frac{G(t) - G(x)}{t - x} - F(x) \right| \le |t - x|^{-1} \left| \int_x^t [F(u) - F(x)]\,du \right|. \tag{4.25} \]
How large can the last integral be? Not larger than $|t - x|\,\omega(x, |x - t|)$,
where
\[ \omega(x, \delta) = \sup \{ |F(u) - F(x)|;\ |u - x| \le \delta \}. \]
From (4.25)
\[ \left| \frac{G(t) - G(x)}{t - x} - F(x) \right| \le \omega(x, |x - t|). \tag{4.26} \]
Since F is continuous at the point x,
\[ \lim_{\delta \to 0} \omega(x, \delta) = 0. \]
From (4.26) we see, therefore, that
\[ \lim_{t \to x} \frac{G(t) - G(x)}{t - x} = F(x) \]
i.e., we see that G is differentiable at x and G'(x) = F(x).
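
The estimate (4.26) can be watched in action. The following is a minimal numerical sketch, not part of the original text. It assumes F = cos on [0, 2], and it approximates integrals by midpoint sums with a fixed number of subintervals (a particular choice of partition and points); the names `integral` and `difference_quotient` are hypothetical.

```python
import math

def integral(F, a, b, n=2000):
    """Midpoint-sum approximation to the integral of F from a to b.
    When b < a the step h is negative, which matches the sign convention
    that the integral from a to b is minus the integral from b to a."""
    h = (b - a) / n
    return h * sum(F(a + (k + 0.5) * h) for k in range(n))

def difference_quotient(F, x, t):
    """(G(t) - G(x)) / (t - x), computed as the integral of F from x to t
    divided by t - x, as in the first line of the proof."""
    return integral(F, x, t) / (t - x)

x = 1.0
for h in (1e-1, 1e-2, 1e-3):
    gap = difference_quotient(math.cos, x, x + h) - math.cos(x)
    print(h, gap)   # the gap shrinks with h, as (4.26) predicts
```

For this F the modulus at x satisfies omega(x, delta) <= delta, so each printed gap should be no larger than h in absolute value.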


Corollary. Let I be any interval on the real line which has a non-empty
interior and let F be any continuous function from I into $R^m$. There exists
a function G from I into $R^m$ such that (G is differentiable and)
\[ G' = F. \]

Proof. The only purpose in assuming I has a non-empty interior is to
avoid wondering what derivatives might be on degenerate intervals, i.e.,
points. Other than for this condition, I may be any type of interval: open,
closed, semi-closed, bounded or unbounded. For the proof, let a be a
point in the interior of I and define
\[ G(x) = \int_a^x F(u)\,du, \qquad x \in I. \]


Corollary. Let F be a piecewise-continuous function from the closed
interval [a, b] into $R^m$. If G is any continuous function from [a, b] into $R^m$
such that G'(x) = F(x) except at a finite number of points x, then
\[ \int_a^b F(x)\,dx = G(b) - G(a). \]

Proof. Define
\[ H(x) = G(a) - G(x) + \int_a^x F(u)\,du. \]
Then H is continuous on [a, b] and, except at a finite number of points,
its derivative exists and is 0. Therefore H = 0. The corollary says that
H(b) = 0.


This last corollary is also referred to at times as the fundamental 
theorem of calculus. It is the result of combining Theorem 7 with the 
fact that a differentiable function on an open interval which has derivative 
0 is constant. It is a powerful tool which says, for example, that if F is 
continuous and we either know or can dig up a function G such that G’ = 
F, then we can evaluate the integral of F over any closed interval within 
its domain by subtracting the values of G at the end points. All of this is 
quite familiar to the reader, of course. We repeat it here to emphasize the 
beauty and power of this result relating differentiation and integration. 
Once we know the derivatives of $x^{n+1}$, sin x, cos x, $e^x$, $\tan^{-1} x$ we can
reduce the problem of calculating
\[ \int_a^b x^n\,dx, \quad \int_a^b \cos x\,dx, \quad \int_a^b \sin x\,dx, \quad \int_a^b e^x\,dx, \quad \int_a^b \frac{1}{1 + x^2}\,dx \]
to calculating the values of the “antiderivative” function at b and a. It is
worthwhile for the reader to pause a few moments to be impressed once
again with such formulae as
\[ \sin x = \int_0^x \cos t\,dt, \qquad e^b - e^a = \int_a^b e^t\,dt, \qquad \frac{\pi}{4} = \int_0^1 \frac{dt}{1 + t^2} \]
and their interpretations in terms of areas.

The first corollary is unquestionably one of the most fundamental 
theorems in mathematics: If F is continuous on an interval, the differential 
equation G'(x) = F(x) has a solution G. If we specify the value of G at
some point $x_0$, then G is unique. The process of integration describes
how to go about calculating the value of this G at any given point. 


Theorem 8 (Change of Variable Theorem). Let g be a real-valued func-
tion of class $C^1$ from the interval [c, d] into the interval [a, b] such that a =
g(c) and b = g(d). If F is any continuous function from [a, b] into $R^m$, then


\[ \int_a^b F(x)\,dx = \int_c^d g'(t)\,F(g(t))\,dt. \]


Proof. Let $G(x) = \int_a^x F(u)\,du$. Then G is differentiable and G’ = F.
Thus the composition H(t) = G(g(t)) is a function of class $C^1$ on the
interval [c, d]. Furthermore, the chain rule tells us that

\[ H'(t) = g'(t)\,G'(g(t)) = g'(t)\,F(g(t)). \]

Thus

\[ \int_c^d g'(t)\,F(g(t))\,dt = \int_c^d H'(t)\,dt = H(d) - H(c) = G(b) - G(a) = \int_a^b F(x)\,dx. \]

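
As a numerical illustration of Theorem 8 (a sketch only, not part of the original text), the following Python fragment compares midpoint-rule approximations to the two sides of the change of variable formula. The choices F(x) = cos x, g(t) = t^2, [c, d] = [0, 2] and the helper name midpoint_integral are assumptions made for the example.

```python
import math

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] by the midpoint rule with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Illustrative choices (assumptions, not taken from the text).
F = math.cos                  # continuous on [a, b]
g = lambda t: t * t           # of class C^1 on [c, d]
dg = lambda t: 2.0 * t        # the derivative g'
c, d = 0.0, 2.0
a, b = g(c), g(d)             # a = g(c), b = g(d), as the theorem requires

lhs = midpoint_integral(F, a, b)                             # integral of F over [a, b]
rhs = midpoint_integral(lambda t: dg(t) * F(g(t)), c, d)     # integral of g'(t) F(g(t)) over [c, d]
print(lhs, rhs, math.sin(b) - math.sin(a))
```

The third printed number is G(b) - G(a) for the antiderivative G = sin, so the three values should agree up to the discretization error of the midpoint rule.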
Exercises 


1. If F is a piecewise-continuous function on [a, b] and 
\[ \int_a^b |F(x)|\,dx = 0, \]


what can you say about F? 

2. If G is a Riemann-integrable function on [a, b] and F is a function such that
F(x) = G(x) except at a finite number of points x, then F is Riemann-integrable
and


\[ \int_a^b F(x)\,dx = \int_a^b G(x)\,dx. \]

3. Show that
[7 e' ax = ee oe ee 
4. Show that for any fixed a, b
lim ἜΝ dt =0 
5. Let f be the function on the real line defined by 


\[ f(x) = \begin{cases} 0, & \text{if } x \text{ is irrational or zero} \\ 1/q, & \text{if } x = p/q \text{ (lowest terms).} \end{cases} \]


(The second description means: If x is rational and non-zero, write x = p/q
where p is an integer, q is a positive integer, p and q have no common factors
other than ±1; then define f(x) = 1/q.) Prove that f is Riemann-integrable on
every interval [a, b] and has integral zero.


6. Let f be the characteristic function of the Cantor set K:

\[ f(x) = \begin{cases} 1, & x \in K \\ 0, & x \notin K. \end{cases} \]

(The Cantor set is Example 24 of Chapter 3.) Prove that f is Riemann-integrable
on the interval [0, 1] and

\[ \int_0^1 f(x)\,dx = 0. \]


7. Let F be a function which is bounded and continuous on the open interval
(a, b). Prove that F is Riemann-integrable on [a, b]. (According to Exercise 2, it
doesn’t matter how F(a) and F(b) are defined.)


8. Let $\{x_n\}$ be a convergent sequence of (distinct) points in the interval [a, b].
Let F be a function from [a, b] into $R^m$ such that


(i) F is bounded;
(ii) F(x) = 0 except (possibly) at the points $x_1, x_2, \ldots$.


Prove that F is Riemann-integrable and

\[ \int_a^b F(x)\,dx = 0. \]

9. Let A be a fixed k x k matrix and define 


\[ F(t) = e^{tA}, \qquad t \in R. \]


Apply the fundamental theorem of calculus to obtain the function G with G’ = F.
Wouldn’t you guess that G is given by


\[ G(t) = A^{-1} e^{tA}? \]

What if A is not an invertible matrix? 


10. If f is a real-valued Riemann-integrable function on [a, b] and F is a Riemann-
integrable function from [a, b] into $R^n$, then fF is Riemann-integrable.


4.4. Integrability and Real-Valued Functions


In this short section, we reformulate the criterion for Riemann- 
integrability of real-valued functions, along the lines that Riemann 
himself followed. The essential point is that the ordering of the real num- 
bers allows one to test the integrability of a real function by comparing 
sums S(f, P, T) for different choices T but the same partition P. In fact,
as the mesh of the partition P decreases to 0, one obtains for each bounded 
function an upper and a lower integral to compare. These will be equal if 
and only if the function is Riemann-integrable. 


Definition. Let f be a bounded real-valued function on the interval [a, b],
and let $P = (x_0, \ldots, x_n)$ be a partition of [a, b]. The upper and lower
Riemann sums associated with f and P are


\[ \overline{S}(f, P) = \sum_{k=1}^{n} M_k (x_k - x_{k-1}) \]


\[ \underline{S}(f, P) = \sum_{k=1}^{n} m_k (x_k - x_{k-1}) \]


where

\[ M_k = \sup_{x \in I_k} f(x) \quad \text{and} \quad m_k = \inf_{x \in I_k} f(x), \qquad I_k = [x_{k-1}, x_k]. \]


Note that the suprema and infima involved in these definitions are
finite because we have assumed that f is bounded. One purpose in introducing
the upper and lower Riemann sums should be apparent: If
$T = (t_1, \ldots, t_n)$ is any choice of points $t_k \in [x_{k-1}, x_k]$, then


\[ \underline{S}(f, P) \le S(f, P, T) \le \overline{S}(f, P) \]


and the gap between $\underline{S}(f, P)$ and $\overline{S}(f, P)$ indicates the range of values for
the approximating sums derived from different choices T. Thus the sizes
of $\overline{S}(f, P) - \underline{S}(f, P)$ for the various partitions P of small mesh provide
some measure of the integrability of f. The theorem below will show that
the Riemann-integrability of f can indeed be measured in this manner. We
urge the reader to draw a picture for the case in which f ≥ 0, to see that
$\overline{S}(f, P)$ and $\underline{S}(f, P)$ are the upper and lower estimates which P provides
for the area under the graph of f and to keep this picture in mind throughout
this section.


Lemma. Let P and Q be partitions of [a, b] such that Q is a refinement
of P. Then

\[ \overline{S}(f, Q) \le \overline{S}(f, P) \]
\[ \underline{S}(f, Q) \ge \underline{S}(f, P). \]

Proof. We leave the proof as an exercise, modulo this remark. Each
subinterval I defined by Q is contained in precisely one subinterval J
defined by P and

\[ \sup_I f \le \sup_J f \]
\[ \inf_I f \ge \inf_J f. \]


Lemma. If P and P’ are any partitions of [a, b], then

\[ \underline{S}(f, P) \le \overline{S}(f, P'). \]

Proof. Let Q be the common refinement of P and P’ obtained by using
the distinct points of subdivision provided by P and P’ jointly. By the
lemma above applied to (P, Q) and (P’, Q), we have


\[ \underline{S}(f, P) \le \underline{S}(f, Q) \le \overline{S}(f, Q) \le \overline{S}(f, P'). \]


Definition. If f is a bounded real-valued function on the interval [a, b],
the upper and lower (Riemann) integrals of f are, respectively,


\[ U(f) = \inf_P \overline{S}(f, P) \]
\[ L(f) = \sup_P \underline{S}(f, P) \]

where in each case P ranges over all partitions of the interval [a, b].


By the last lemma, we have


\[ L(f) \le U(f) \]


for every bounded function f. If f is Riemann-integrable, then clearly

\[ \int_a^b f(x)\,dx = L(f) = U(f). \]


Theorem 9. Let f be a bounded real-valued function on the closed
interval [a, b]. The following are equivalent.


(i) f is Riemann-integrable.
(ii) U(f) = L(f).
(iii) For each ε > 0 there exists a partition P of [a, b] such that
$\overline{S}(P) - \underline{S}(P) < \epsilon$.
(iv) $\lim_{\|P\| \to 0} [\overline{S}(P) - \underline{S}(P)] = 0$.


Proof. (i) ⇒ (ii). Suppose f is Riemann-integrable. Let ε > 0. There
exists δ > 0 such that the set $\Sigma_\delta(f) = \{S(f, P, T);\ \|P\| < \delta\}$ has
diameter less than ε. This certainly shows that U(f) − L(f) ≤ ε.

(ii) ⇒ (iii). Suppose U(f) = L(f). Let ε > 0. There exist partitions
P, P’ such that


\[ \overline{S}(P) < U(f) + \frac{\epsilon}{2} \]


\[ \underline{S}(P') > L(f) - \frac{\epsilon}{2}. \]

Then, since U(f) = L(f), we have $\overline{S}(P) - \underline{S}(P') < \epsilon$. Now, let Q be the
common refinement of P and P’ which we have formed several times
previously. Since $\overline{S}(Q) \le \overline{S}(P)$ and $\underline{S}(Q) \ge \underline{S}(P')$, we have
$\overline{S}(Q) - \underline{S}(Q) < \epsilon$.

(iii) ⇒ (iv). Assume (iii), and let ε > 0. Choose a partition
$Q = (y_0, \ldots, y_m)$ of the interval [a, b] such that $\overline{S}(Q) - \underline{S}(Q) < \epsilon/2$. We will
use this fixed partition Q to find δ > 0 such that $\overline{S}(P) - \underline{S}(P) < \epsilon$ for
all partitions P of mesh at most δ. The idea is this. Given any partition
$P = (x_0, \ldots, x_n)$, define P* to be the common refinement of P and Q.
Let’s compare $\overline{S}(f, P)$ and $\overline{S}(f, P^*)$. In the sum


\[ \overline{S}(f, P) = \sum_k M_k (x_k - x_{k-1}) \]


every term will appear as a term in $\overline{S}(f, P^*)$ except for those values of k
such that some $y_j$ lands in the interior of $[x_{k-1}, x_k]$. There are at most
(m − 1) such values of k; hence the sum of the terms $M_k (x_k - x_{k-1})$
corresponding to such values of k cannot exceed


\[ (m - 1) M \|P\| \]

where $M = \sup_{[a, b]} |f|$. Consequently,

\[ \overline{S}(f, P) \le \overline{S}(f, P^*) + (m - 1) M \|P\|. \]

By closely analogous reasoning,

\[ \underline{S}(f, P) \ge \underline{S}(f, P^*) - (m - 1) M \|P\|. \]

Now, let $\delta = \epsilon / [4(m - 1)M]$. Then the inequalities above tell us that


\[ \overline{S}(f, P) - \underline{S}(f, P) \le \overline{S}(f, P^*) - \underline{S}(f, P^*) + \frac{\epsilon}{2}. \]

Furthermore, P* is a refinement of Q, so that

\[ \overline{S}(f, P^*) - \underline{S}(f, P^*) \le \overline{S}(f, Q) - \underline{S}(f, Q) < \frac{\epsilon}{2}. \]


We conclude that $\overline{S}(f, P) - \underline{S}(f, P) < \epsilon$, for all partitions P with $\|P\| \le \delta$.

(iv) ⇒ (i). We need to show that (iv) implies that the diameter of the
set $\Sigma_\delta(f) = \{S(f, P, T);\ \|P\| < \delta\}$ tends to 0 as δ tends to 0. What we
shall show is that


\[ \tfrac{1}{2} \operatorname{diam} \Sigma_\delta(f) \le \sup_{\|P\| < \delta} [\overline{S}(f, P) - \underline{S}(f, P)] \le \operatorname{diam} \Sigma_\delta(f). \]


The right-hand inequality is trivial because

\[ \operatorname{diam} \Sigma_\delta(f) = \sup |S(f, P, T) - S(f, Q, U)| \]


where the supremum is extended over all partitions P and Q of mesh less
than δ and over all corresponding choices T and U of points in the sub-
intervals defined by P and Q.

To verify the left-hand inequality, let P and Q be any partitions of
[a, b] of mesh less than δ. By the last lemma, $\overline{S}(f, Q) - \underline{S}(f, P) \ge 0$ and hence


\[ \begin{aligned} S(f, P, T) - S(f, Q, U) &\le \overline{S}(f, P) - \underline{S}(f, Q) \\ &\le \overline{S}(f, P) - \underline{S}(f, Q) + [\overline{S}(f, Q) - \underline{S}(f, P)] \\ &= [\overline{S}(f, P) - \underline{S}(f, P)] + [\overline{S}(f, Q) - \underline{S}(f, Q)] \\ &\le 2 \sup_{\|P\| < \delta} [\overline{S}(f, P) - \underline{S}(f, P)]. \end{aligned} \]


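Corollary. If f is a monotone increasing (or decreasing) function on the
interval [a, b], then f is Riemann-integrable.

Proof. Let $P = (x_0, x_1, \ldots, x_n)$ be any partition of [a, b]. Then,
since f is an increasing function, for each subinterval $I_k = [x_{k-1}, x_k]$
we have


\[ \sup_{I_k} f = f(x_k) \]
\[ \inf_{I_k} f = f(x_{k-1}). \]


Accordingly,

\[ \overline{S}(f, P) = \sum_k f(x_k)(x_k - x_{k-1}) \]
\[ \underline{S}(f, P) = \sum_k f(x_{k-1})(x_k - x_{k-1}) \]

and so


\[ \begin{aligned} \overline{S}(f, P) - \underline{S}(f, P) &= \sum_k [f(x_k) - f(x_{k-1})](x_k - x_{k-1}) \\ &\le \|P\| \sum_k [f(x_k) - f(x_{k-1})] \\ &= \|P\| [f(b) - f(a)]. \end{aligned} \]


Hence it is apparent that


\[ \lim_{\|P\| \to 0} [\overline{S}(f, P) - \underline{S}(f, P)] = 0. \]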
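
The estimate in this proof can be checked numerically. The sketch below (an illustration only; the function x^3 on [0, 2], the non-uniform partitions, and the helper names upper_lower_sums and mesh are assumptions made for the example) computes the upper and lower sums of a non-decreasing function, where the supremum and infimum on each subinterval are attained at the right and left endpoints, and compares their gap with the bound ||P|| [f(b) - f(a)].

```python
def upper_lower_sums(f, partition):
    """Upper and lower Riemann sums of a NON-DECREASING f over the given partition.

    For such f the sup on [x0, x1] is f(x1) and the inf is f(x0).
    """
    pairs = list(zip(partition, partition[1:]))
    upper = sum(f(x1) * (x1 - x0) for x0, x1 in pairs)
    lower = sum(f(x0) * (x1 - x0) for x0, x1 in pairs)
    return upper, lower

def mesh(partition):
    """Length of the longest subinterval of the partition."""
    return max(x1 - x0 for x0, x1 in zip(partition, partition[1:]))

f = lambda x: x ** 3          # increasing on [0, 2]; its integral there is 4
a, b = 0.0, 2.0
for n in (4, 16, 64, 256):
    # A deliberately non-uniform partition of [a, b].
    P = [a + (b - a) * (k / n) ** 2 for k in range(n + 1)]
    U, L = upper_lower_sums(f, P)
    print(n, L, U, U - L, mesh(P) * (f(b) - f(a)))
```

In each printed row the gap U - L should not exceed the last column, and both sums should bracket the value of the integral.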


Corollary. If f is a real-valued Riemann-integrable function on the
interval I = [a, b], then |f| is Riemann-integrable and


\[ \left| \int_I f \right| \le \int_I |f|. \]


Proof. We remarked earlier that the inequality on integrals follows as
soon as |f| is proved to be Riemann-integrable. This proof consists of
showing that

\[ \overline{S}(|f|, P) - \underline{S}(|f|, P) \le \overline{S}(f, P) - \underline{S}(f, P) \]


for every partition P. This is easy to see because, on each subinterval
$I_k = [x_{k-1}, x_k]$ defined by P,

\[ \begin{aligned} \sup_{I_k} |f| - \inf_{I_k} |f| &= \sup_{s, t \in I_k} [\,|f(t)| - |f(s)|\,] \\ &\le \sup_{s, t \in I_k} |f(t) - f(s)| \\ &= \sup_{s, t \in I_k} [f(t) - f(s)] \\ &= \sup_{I_k} f - \inf_{I_k} f. \end{aligned} \]


The critical step occurred between the second and third lines above, where
we used the fact that f is real-valued and the consequent fact that
|f(t) − f(s)| is either f(t) − f(s) or f(s) − f(t).


Corollary. If F is a Riemann-integrable function from the interval
I = [a, b] into $R^m$, then |F| is Riemann-integrable and


\[ \left| \int_I F \right| \le \int_I |F|. \]


Proof. Once again, all we need to show is that |F| is Riemann-
integrable. Let $P = (x_0, \ldots, x_n)$ be any partition of [a, b]. Then


\[ \overline{S}(|F|, P) - \underline{S}(|F|, P) = \sum_k \eta_k (x_k - x_{k-1}) \]


where

\[ \eta_k = \sup_{s, t \in I_k} [\,|F(t)| - |F(s)|\,], \qquad I_k = [x_{k-1}, x_k]. \]


If $Y = (y_1, \ldots, y_m)$ and $Z = (z_1, \ldots, z_m)$ are any two vectors in $R^m$, then

\[ |Z| - |Y| \le \sum_j |z_j - y_j|. \]

Thus, if $F = (f_1, \ldots, f_m)$, we have

\[ |F(t)| - |F(s)| \le \sum_j |f_j(t) - f_j(s)|. \]


Therefore,


\[ \begin{aligned} \eta_k &\le \sup_{s, t \in I_k} \sum_j |f_j(t) - f_j(s)| \\ &\le \sum_j \sup_{s, t \in I_k} |f_j(t) - f_j(s)| \\ &= \sum_j \sup_{s, t \in I_k} [f_j(t) - f_j(s)] \end{aligned} \]


and consequently,

\[ \overline{S}(|F|, P) - \underline{S}(|F|, P) \le \sum_j [\overline{S}(f_j, P) - \underline{S}(f_j, P)]. \]


Since F is Riemann-integrable, each $f_j$ is; hence, it is apparent from the
last inequality (and Theorem 9) that |F| is Riemann-integrable.


Exercises


1. Complete the proof of the first lemma of this section. 


2. Prove that if the real-valued function f is Riemann-integrable on [a, b] then
$f^2$ is Riemann-integrable.


3. Use the result of Exercise 2 to prove that the product of two real-valued 
Riemann-integrable functions is Riemann-integrable. 


4. If F and G are Riemann-integrable functions from [a, b] into $R^n$, then the
function $\langle F, G \rangle$ is Riemann-integrable.


5. Establish the result of Exercise 3 for complex-valued functions. 


6. A real-valued function f is called piecewise-monotone on [a, b] if there is a
partition P of [a, b], on each subinterval of which f is either non-decreasing or
non-increasing. Show that every piecewise-monotone function is Riemann-
integrable.


7. If f is a non-decreasing real-valued function on [a, b], then

\[ g(x) = \int_a^x f(t)\,dt \]

is a convex function on [a, b].


8. If f is a non-negative Riemann-integrable function, then $\sqrt{f}$ is Riemann-
integrable.


4.5. Differentiation and Integration in $R^n$


We shall not engage in a thorough discussion of either derivatives or 
integrals in m-space at this point. But there are a few basic facts which are 
used so often and which follow so naturally upon the 1-dimensional case 
that it would be artificial to delay their discussion. 

If we start from differentiation in $R^1$, one natural thing to consider is
the differentiation of functions along lines in $R^n$. Such derivatives measure
rates of change in various directions. If we start from, say, the origin, the
different (straight-line) directions in which we can strike out are represented
by the rays which emanate from the origin. A convenient way to enumerate
those rays is by the points of the unit sphere. Each ray emanating from the
origin contains exactly one vector V such that |V| = 1.

Suppose that F is a function from (a subset of) $R^n$ into $R^m$. Let X be
a point in the interior of the domain of F. If |V| = 1, we can inquire
whether the limit

\[ \lim_{t \to 0} \frac{F(X + tV) - F(X)}{t} \]

exists. If it exists, we call it the derivative of F at X in the direction of the
(unit) vector V and we denote it by $(D_V F)(X)$. The point of insisting that
X be in the interior of the domain of F is (of course) that X + tV will
then be in the domain for all sufficiently small t ∈ R.

We can (and shall) discuss


\[ (D_V F)(X) = \lim_{t \to 0} t^{-1}[F(X + tV) - F(X)] \tag{4.27} \]

for all vectors V in $R^n$. It is easy to see that if $(D_V F)(X)$ exists, then
$(D_{cV} F)(X)$ exists for every c ∈ R and

\[ (D_{cV} F)(X) = c\,(D_V F)(X). \]

If V ≠ 0, the unique unit vector which has the same direction as does V
is $|V|^{-1} V$ and thus (if it exists) the derivative of F at X in the direction of
V is

\[ \frac{1}{|V|} (D_V F)(X). \]

If they exist, the derivatives of F in the directions of the standard
basis vectors are called the partial derivatives of F. We write $D_j F$ rather
than the cumbersome $D_{E_j} F$. Thus


\[ (D_j F)(X) = \lim_{t \to 0} t^{-1}[F(x_1, \ldots, x_j + t, \ldots, x_n) - F(x_1, \ldots, x_n)]. \]


At times, we shall also use the notation

\[ \frac{\partial F}{\partial x_j} = D_j F. \]


A function (or map) of class $C^1$ is a function F such that


(i) the domain of F is an open subset of $R^n$;
(ii) the partial derivatives $D_1 F, \ldots, D_n F$ exist and are continuous
functions.


We call F a function of class $C^k$ (k ≥ 2) if the partial derivatives $D_1 F, \ldots, D_n F$
are functions of class $C^{k-1}$. If F is of class $C^k$ for every k = 1, 2, 3,
..., then we say that F is of class $C^\infty$.


Theorem 10. Let F be a map of class $C^1$ on the open set U in $R^n$. If
X ∈ U and V ∈ $R^n$, then the derivative $(D_V F)(X)$ exists and


\[ (D_V F)(X) = v_1 (D_1 F)(X) + \cdots + v_n (D_n F)(X). \tag{4.28} \]


Proof. The map F is given by an m-tuple of real-valued functions on
U, $F = (f_1, \ldots, f_m)$. It should be clear that


\[ D_V F = (D_V f_1, \ldots, D_V f_m) \]


and that (therefore) we need only prove this theorem for real-valued
functions. Let

\[ f : U \to R \]

be a function of class $C^1$ on U. Let X ∈ U and V ∈ $R^n$. There is a δ > 0
such that (X + tV) ∈ U, |t| < δ. For any t with |t| < δ consider the
difference quotient


\[ \frac{f(X + tV) - f(X)}{t} = \frac{f(x_1 + tv_1, \ldots, x_n + tv_n) - f(x_1, \ldots, x_n)}{t}. \]


We rewrite that difference quotient by adding and then subtracting a few 
terms. Let us write it out fully in the case n = 3: 


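\[ \begin{aligned} \frac{f(X + tV) - f(X)}{t} ={} & \frac{f(x_1 + tv_1, x_2 + tv_2, x_3 + tv_3) - f(x_1, x_2 + tv_2, x_3 + tv_3)}{t} \\ & + \frac{f(x_1, x_2 + tv_2, x_3 + tv_3) - f(x_1, x_2, x_3 + tv_3)}{t} \\ & + \frac{f(x_1, x_2, x_3 + tv_3) - f(x_1, x_2, x_3)}{t}. \end{aligned} \tag{4.29} \]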


Fix t. Then $P_t = (x_1, x_2 + tv_2, x_3 + tv_3)$ is a point in the open ball
B(X; δ|V|) as is $Q_t = X + tV$. We apply the mean value theorem to the
function which f defines along the line segment from $P_t$ to $Q_t$, i.e., we
apply it to the function


\[ g(x) = f(P_t + xE_1). \]


Since $D_1 f$ exists on U, g is a differentiable function on the interval between
0 and $tv_1$. The mean value theorem tells us that


\[ g(tv_1) - g(0) = tv_1\, g'(\theta) \]


where θ is between 0 and $tv_1$. Thus

\[ \frac{f(Q_t) - f(P_t)}{t} = v_1 (D_1 f)(X_t) \]

where $X_t$ is some point between $P_t$ and $Q_t$. Now apply the same sort of
reasoning to the second and third terms in (4.29) and it should be clear
that


\[ \frac{f(X + tV) - f(X)}{t} = v_1 (D_1 f)(X_t) + v_2 (D_2 f)(Y_t) + v_3 (D_3 f)(Z_t) \tag{4.30} \]


where $X_t$, $Y_t$, and $Z_t$ are points in the ball B(X; |t||V|).
As t goes to 0, the points $X_t$, $Y_t$, $Z_t$ converge to X. Since the deriva-
tives $D_j f$ are continuous, we obtain from (4.30):

\[ (D_V f)(X) = v_1 (D_1 f)(X) + v_2 (D_2 f)(X) + v_3 (D_3 f)(X). \]


If f is a real-valued function of class $C^1$, then the gradient of f is the
(continuous) vector-valued function


\[ \operatorname{grad} f = (D_1 f, \ldots, D_n f). \]

In this case, (4.28) becomes


\[ D_V f = \langle V, \operatorname{grad} f \rangle = v_1 \frac{\partial f}{\partial x_1} + \cdots + v_n \frac{\partial f}{\partial x_n}. \]

In other words the derivative of f at X in the direction of the vector V is
the inner product of V with the gradient of f at X.
This notation for the gradient is a bit awkward. At times it is more
convenient to employ the notation


\[ f' = \operatorname{grad} f. \]

The reformulation of (4.28) then becomes


\[ (D_V f)(X) = \langle V, f'(X) \rangle. \]

In employing this notation, remember that f’ is a vector-valued function.
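
The formula (4.28) lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions: the function f, the point X, the vector V, and the helper name directional_derivative are all chosen for the example, and the gradient is computed by hand. It approximates the limit (4.27) by a symmetric difference quotient and compares it with the inner product of V with the gradient.

```python
import math

def f(x, y, z):
    """An illustrative function of class C^1 on R^3 (an assumption for this example)."""
    return x * y + math.sin(z) * x * x

def grad_f(x, y, z):
    """Partial derivatives (D_1 f, D_2 f, D_3 f), computed by hand for this f."""
    return (y + 2 * x * math.sin(z), x, x * x * math.cos(z))

def directional_derivative(func, X, V, t=1e-6):
    """Symmetric difference quotient approximating (D_V func)(X)."""
    plus = func(*(x + t * v for x, v in zip(X, V)))
    minus = func(*(x - t * v for x, v in zip(X, V)))
    return (plus - minus) / (2 * t)

X = (1.0, -2.0, 0.5)
V = (3.0, 1.0, -2.0)          # V need not be a unit vector in (4.27)
numeric = directional_derivative(f, X, V)
exact = sum(v * g for v, g in zip(V, grad_f(*X)))   # <V, grad f(X)>
print(numeric, exact)
```

The two printed values should agree to several decimal places; the small discrepancy comes from the finite step t.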
We have not yet said anything about higher order derivatives in $R^n$.
If f is of class $C^2$, then $D_1 f, \ldots, D_n f$ are of class $C^1$ and so we have the
second order derivatives


\[ \frac{\partial^2 f}{\partial x_i \partial x_j} = D_i D_j f, \qquad 1 \le i \le n, \quad 1 \le j \le n. \]


In case i = j it is customary to write

\[ D_j D_j f = \frac{\partial^2 f}{\partial x_j^2}. \]

One basic fact here is that, if f is of class $C^2$, then $D_i D_j f = D_j D_i f$, that is

\[ \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}. \tag{4.31} \]


We shall defer the proof for a couple of pages, because it comes easily out
of the basic observations on integration which we are about to make.
From (4.31) we see that, if f is of class $C^k$, each kth partial derivative of f
has the form

\[ D_1^{k_1} \cdots D_n^{k_n} f = \frac{\partial^k f}{\partial x_1^{k_1} \cdots \partial x_n^{k_n}} \]

where $k_1 + \cdots + k_n = k$.

Let f be a real-valued function defined on an open set U in $R^n$. We
say that f has a local maximum at the point X in U if there is a δ > 0 such
that f(X) ≥ f(T) for all T such that |X − T| < δ. In short, f has a
local maximum at X if there is a neighborhood of X on which the value of
f is never greater than f(X). The term “local minimum” is defined similarly.


Theorem 11. Let f be a real-valued function of class $C^1$ on an open set
U in $R^n$. If f has a local maximum at the point X in U, then f’(X) = 0, that
is, $(D_j f)(X) = 0$, 1 ≤ j ≤ n.


Proof. Exercise. 


Let us review very quickly the definition of the integral of a (contin-
uous) function over a closed box in $R^n$. We shall be brief because the
reasoning involved is essentially the same as for integrals over closed
intervals.

A closed box in $R^n$ is a Cartesian product of n closed intervals

\[ B = I_1 \times \cdots \times I_n = \{X;\ x_k \in I_k,\ 1 \le k \le n\}. \]

If $I_k = [a_k, b_k]$, then

\[ B = \{X;\ a_k \le x_k \le b_k,\ 1 \le k \le n\}. \]

A partition of such a closed box B is a set $P = \{B_1, \ldots, B_N\}$ of
(closed) boxes $B_k$ such that

(i) $B = B_1 \cup \cdots \cup B_N$;

(ii) if i ≠ j the intersection of the interior of $B_i$ with the interior of
$B_j$ is empty.

The mesh of such a partition P is

\[ \|P\| = \max_k \operatorname{diam}(B_k). \]


The partition $P' = \{B'_1, \ldots, B'_M\}$ is a refinement of the partition
$P = \{B_1, \ldots, B_N\}$ if each box $B'_j$ is contained in one of the boxes $B_k$.

We should remark that when n = 1 the last three definitions reduce
to the definitions we have already given in relation to partitions of an
interval. Let us also note the following. One way to construct a partition
of the box


\[ B = I_1 \times \cdots \times I_n, \qquad I_k = [a_k, b_k] \tag{4.32} \]


is by choosing partitions of each of the intervals $I_1, \ldots, I_n$ and using the
associated Cartesian products of all the subintervals. For instance suppose
that n = 2, I = [a, b], J = [c, d] and

\[ B = I \times J = \{(x, y);\ a \le x \le b,\ c \le y \le d\}. \]


Let $P = (x_0, \ldots, x_n)$ be a partition of [a, b] and $Q = (y_0, \ldots, y_p)$ be a
partition of [c, d]. The product boxes


\[ I_k \times J_\ell = \{(x, y);\ x_{k-1} \le x \le x_k,\ y_{\ell-1} \le y \le y_\ell\} \]


partition the box B. There are np such boxes since k ranges from 1 to n
and ℓ from 1 to p. The reasoning by which one obtains a partition of the
box B (4.32) from partitions of the intervals $I_k$ is similar; it is notationally
a bit more complicated when n > 2.

It is not true that every partition of B is obtained from partitions of
$I_1, \ldots, I_n$ in the manner just described. But it is true that every partition
has a refinement which is of this special type.

In order to define integrals in n-dimensions, we need to use the n-
dimensional volume of a box, i.e., that number which extends the idea of
length in $R^1$, area in $R^2$, volume in $R^3$. We shall use the term “measure”,
rather than “n-dimensional volume”. Thus, the measure of the box
$B = I_1 \times \cdots \times I_n$ is


\[ m(B) = \prod_{k=1}^{n} \operatorname{length}(I_k). \]
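
The product construction of a partition and the definition of measure can be sketched in a few lines of Python for the case n = 2. This is an illustration only: the partitions P and Q below and the helper names product_partition and measure are assumptions made for the example. The code forms the boxes obtained from a partition of [a, b] and a partition of [c, d], counts them, and adds up their measures.

```python
from itertools import product

def product_partition(P, Q):
    """Boxes I_k x J_l built from a partition P of [a, b] and a partition Q of [c, d].

    Each box is represented as a pair of intervals ((x0, x1), (y0, y1)).
    """
    xs = list(zip(P, P[1:]))      # subintervals of [a, b]
    ys = list(zip(Q, Q[1:]))      # subintervals of [c, d]
    return list(product(xs, ys))

def measure(box):
    """Measure of a box in R^2: the product of the lengths of its sides."""
    (x0, x1), (y0, y1) = box
    return (x1 - x0) * (y1 - y0)

P = [0.0, 0.5, 1.5, 2.0]          # 3 subintervals of [0, 2]
Q = [1.0, 2.0, 4.0]               # 2 subintervals of [1, 4]
boxes = product_partition(P, Q)
total = sum(measure(B) for B in boxes)
print(len(boxes), total, measure(((0.0, 2.0), (1.0, 4.0))))
```

The number of boxes is the product of the numbers of subintervals, and the measures of the subboxes add up to the measure of the whole box.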


Now let B be a closed box in $R^n$ and let F be a function

\[ F : B \to R^m. \]

If $P = \{B_1, \ldots, B_N\}$ is any partition of B and if $T = (T_1, \ldots, T_N)$ is any
choice of points $T_k \in B_k$, define

\[ S(F, P, T) = \sum_{k=1}^{N} m(B_k) F(T_k). \tag{4.33} \]

The function F is called Riemann-integrable on the box B if the limit


\[ \lim_{\|P\| \to 0} S(F, P, T) \]


exists. For each δ > 0 define $\Sigma_\delta(F) = \{S(F, P, T);\ \|P\| < \delta\}$. Then F
is Riemann-integrable if and only if


\[ \lim_{\delta \to 0} \operatorname{diam} \Sigma_\delta(F) = 0. \]


When F is integrable, the integral of F over B is

\[ \int_B F = \bigcap_{\delta > 0} \overline{\Sigma_\delta(F)}. \]

Other notations employed for this integral are

\[ \int_B F(X)\,dX, \qquad \int_B F(x_1, \ldots, x_n)\,dx_1 \cdots dx_n. \]


The latter notation comes from the notation $\sum F(x_1, \ldots, x_n)\,\Delta x_1 \cdots \Delta x_n$
for the approximating sums.

Each continuous function on a closed box B is Riemann-integrable.
The proof proceeds just as it did in the case n = 1. The essential step 
exploits the uniform continuity. 

As in the Corollary to Theorem 3, the integral of a continuous func-
tion F satisfies


\[ \left| \int_B F - S(F, P, T) \right| \le m(B)\,\omega(F, 2\delta) \]

for every partition P of mesh at most δ (and for every choice T of points
in the subboxes of the partition). From the result about continuous func-
tions it follows that every piecewise-continuous function on B is Riemann-
integrable. The function F

\[ F : B \to R^m \]

is called piecewise-continuous if there is a partition $P = \{B_1, \ldots, B_N\}$ of
the box B such that F is uniformly continuous on the interior of each
box $B_k$.

The basic properties of integration summarized in Theorem 4 follow
by precisely the same argument as in the case n = 1. In the analogue of
property (ii) the role of (b − a) is played by m(B):


\[ \left| \int_B F \right| \le m(B) \sup_{X \in B} |F(X)|. \]


The only technical lemma which must be verified to employ the arguments
given when n = 1 is this: If $P = \{B_1, \ldots, B_N\}$ is a partition of the box B,
then

\[ m(B) = m(B_1) + \cdots + m(B_N). \]

We also obtain the extension of this additivity property which is analogous
to Theorem 5: If $P = \{B_1, \ldots, B_N\}$ is a partition of B and


\[ F : B \to R^m, \]

then F is integrable if and only if F is integrable on each $B_k$, and when F
is integrable


\[ \int_B F = \int_{B_1} F + \cdots + \int_{B_N} F. \]

If F is an integrable function on the box $B = I_1 \times \cdots \times I_n$, then the
integral of F over B can be calculated as an iterated integral, i.e., we can
calculate


\[ \int_B F(x_1, \ldots, x_n)\,dx_1 \cdots dx_n \]

by integrating with respect to the variables $x_k$, one at a time. We will
verify this only when F is continuous, because the more general result
gets into technicalities which will take us too far afield.


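Theorem 12 (Iterated Integrals Theorem). Let F be a continuous func-
tion on the closed box B. Suppose that


\[ B = B_1 \times B_2 \]

where $B_1$ and $B_2$ are closed boxes. Then


\[ \int_B F = \int_{B_1} \left[ \int_{B_2} F(X, Y)\,dY \right] dX = \int_{B_2} \left[ \int_{B_1} F(X, Y)\,dX \right] dY. \]

Proof. The hypothesis is that $B_j$ is a closed box in $R^{n_j}$, where $n_1 + n_2 = n$
and that


\[ B = \{X \in R^n;\ (x_1, \ldots, x_{n_1}) \in B_1 \text{ and } (x_{n_1 + 1}, \ldots, x_n) \in B_2\}. \]


The only thing which is at all non-trivial in the proof is to avoid becom-
ing lost in a maze of notation and detail. The point of the proof is that
$m(B_1 \times B_2) = m(B_1)\,m(B_2)$, for any boxes $B_1$, $B_2$. Let us write out the
proof when n = 2:

\[ B = [a, b] \times [c, d]. \]


Choose the partitions

\[ P = (x_0, \ldots, x_n), \qquad a = x_0 \le x_1 \le \cdots \le x_n = b \]
\[ Q = (y_0, \ldots, y_k), \qquad c = y_0 \le y_1 \le \cdots \le y_k = d. \]

Let $B_{ij} = [x_{i-1}, x_i] \times [y_{j-1}, y_j]$. Let

\[ \begin{aligned} S &= \sum_{i, j} m(B_{ij}) F(x_i, y_j) \\ &= \sum_{i, j} (x_i - x_{i-1})(y_j - y_{j-1}) F(x_i, y_j) \\ &= \sum_{i=1}^{n} (x_i - x_{i-1}) \sum_{j=1}^{k} (y_j - y_{j-1}) F(x_i, y_j). \end{aligned} \]

Let δ be the mesh of the partition $\{B_{ij};\ 1 \le i \le n,\ 1 \le j \le k\}$. Then
‖P‖ ≤ δ and ‖Q‖ ≤ δ. The idea is that if δ is small,


\[ \sum_{j=1}^{k} (y_j - y_{j-1}) F(x_i, y_j) \approx \int_c^d F(x_i, y)\,dy \]


and we can get an estimate of how close the approximation is which is
uniform in $x_i$ (does not depend upon $x_i$). For any fixed index i, the func-
tion $G_i$ defined by $G_i(y) = F(x_i, y)$ is a continuous function on the interval
[c, d]. According to the Corollary of Theorem 3,


\[ \left| \int_c^d F(x_i, y)\,dy - \sum_{j=1}^{k} (y_j - y_{j-1}) F(x_i, y_j) \right| \le (d - c)\,\omega(G_i, 2\delta). \tag{4.34} \]

The modulus of continuity $\omega(G_i, \delta)$ is defined by

\[ \begin{aligned} \omega(G_i, \delta) &= \sup\{|G_i(y) - G_i(z)|;\ y, z \in [c, d],\ |y - z| < \delta\} \\ &= \sup\{|F(x_i, y) - F(x_i, z)|;\ y, z \in [c, d],\ |y - z| < \delta\}. \end{aligned} \]


If |y − z| < δ, then the points $(x_i, y)$ and $(x_i, z)$ are not more than δ
apart. Therefore,

\[ \omega(G_i, \delta) \le \omega(F, \delta). \]


By the inequality (4.34) we have

\[ \begin{aligned} \left| S - \sum_{i=1}^{n} (x_i - x_{i-1}) \int_c^d F(x_i, y)\,dy \right| &\le \sum_{i=1}^{n} (x_i - x_{i-1})(d - c)\,\omega(G_i, 2\delta) \\ &\le (d - c)\,\omega(F, 2\delta) \sum_{i=1}^{n} (x_i - x_{i-1}) \\ &= (d - c)(b - a)\,\omega(F, 2\delta) \\ &= m(B)\,\omega(F, 2\delta). \end{aligned} \tag{4.35} \]

Define

\[ G(x) = \int_c^d F(x, y)\,dy, \qquad a \le x \le b. \]

Then G is a continuous function on [a, b] because

\[ \begin{aligned} |G(x) - G(t)| &= \left| \int_c^d [F(x, y) - F(t, y)]\,dy \right| \\ &\le (d - c) \sup_y |F(x, y) - F(t, y)|. \end{aligned} \]

If |x − t| < δ, we have

\[ |G(x) - G(t)| \le (d - c)\,\omega(F, \delta). \]


Hence,

\[ \omega(G, \delta) \le (d - c)\,\omega(F, \delta). \]

Let


\[ \tilde{S} = \sum_{i=1}^{n} (x_i - x_{i-1}) G(x_i). \]

By the Corollary to Theorem 3,

\[ \begin{aligned} \left| \int_a^b G(x)\,dx - \tilde{S} \right| &\le (b - a)\,\omega(G, 2\delta) \\ &\le (b - a)(d - c)\,\omega(F, 2\delta) \\ &= m(B)\,\omega(F, 2\delta) \end{aligned} \]

and (4.35) says that


\[ |S - \tilde{S}| \le m(B)\,\omega(F, 2\delta). \]

Therefore,

\[ \left| \int_a^b G(x)\,dx - S \right| \le 2 m(B)\,\omega(F, 2\delta). \tag{4.36} \]

Let δ → 0. We have

\[ \lim_{\delta \to 0} \omega(F, 2\delta) = 0 \]


\[ \lim_{\delta \to 0} S = \int_B F. \]


From (4.36) it follows that


\[ \int_B F = \int_a^b G(x)\,dx = \int_a^b \left[ \int_c^d F(x, y)\,dy \right] dx. \]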
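
The two quantities compared in this proof can be computed side by side. In the sketch below (an illustration, not part of the original text) the integrand F(x, y) = x e^{xy}, the box [0, 1] x [0, 2], the grid sizes, and the helper names riemann_sum_2d and iterated_integral are assumptions made for the example; the closed form of the integral was worked out by hand for this particular F.

```python
import math

def riemann_sum_2d(F, a, b, c, d, n, k):
    """The sum S of the proof: sum of m(B_ij) * F(x_i, y_j) over a uniform n-by-k grid."""
    hx, hy = (b - a) / n, (d - c) / k
    return sum(hx * hy * F(a + i * hx, c + j * hy)
               for i in range(1, n + 1) for j in range(1, k + 1))

def iterated_integral(F, a, b, c, d, n, k):
    """Integrate G(x) = (integral of F(x, y) dy over [c, d]) over [a, b], by midpoint rules."""
    hx, hy = (b - a) / n, (d - c) / k
    def G(x):
        return hy * sum(F(x, c + (j + 0.5) * hy) for j in range(k))
    return hx * sum(G(a + (i + 0.5) * hx) for i in range(n))

F = lambda x, y: x * math.exp(x * y)     # illustrative continuous integrand
a, b, c, d = 0.0, 1.0, 0.0, 2.0
exact = (math.exp(2.0) - 3.0) / 2.0      # value of the double integral for this F

S = riemann_sum_2d(F, a, b, c, d, 200, 200)
I = iterated_integral(F, a, b, c, d, 200, 200)
print(S, I, exact)
```

Both approximations should approach the same value as the grids are refined, the sum S more slowly because it samples F at corner points rather than midpoints.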


Corollary. If F is a continuous function on the box


\[ B = I_1 \times \cdots \times I_n, \qquad I_k = [a_k, b_k] \]

then


\[ \int_B F = \int_{a_n}^{b_n} \cdots \int_{a_1}^{b_1} F(x_1, \ldots, x_n)\,dx_1 \cdots dx_n \]


where the symbol on the right denotes the result of integrating $F(x_1, \ldots, x_n)$
first with respect to $x_1$, then integrating the resulting function with respect
to $x_2$, etc. The same result holds if the n integrations are carried out in any
order.


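Theorem 13. If F is a function of class $C^2$ on an open subset of $R^n$,
then $D_i D_j F = D_j D_i F$, that is


\[ \frac{\partial^2 F}{\partial x_i \partial x_j} = \frac{\partial^2 F}{\partial x_j \partial x_i}, \qquad 1 \le i \le n, \quad 1 \le j \le n. \]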


Proof. We need only prove the theorem for n = 2. Let

\[ B = [a, b] \times [c, d] \]


be a box inside the domain of F. The derivative $D_1 D_2 F$ is continuous.
By Theorem 12 and the fundamental theorem of calculus


\[ \begin{aligned} \int_B D_1 D_2 F &= \int_c^d \int_a^b \frac{\partial^2 F}{\partial x_1 \partial x_2}\,dx_1\,dx_2 \\ &= \int_c^d [(D_2 F)(b, x_2) - (D_2 F)(a, x_2)]\,dx_2 \\ &= F(b, d) - F(b, c) + F(a, c) - F(a, d). \end{aligned} \]


If we replace b by any number x, a ≤ x ≤ b, and replace d by any number
y, c ≤ y ≤ d, we have a similar conclusion which, for convenience, we
rewrite as


\[ \begin{aligned} F(x, y) &= \int_c^y \int_a^x (D_1 D_2 F)(x_1, x_2)\,dx_1\,dx_2 + F(x, c) + F(a, y) - F(a, c) \\ &= \int_a^x \int_c^y (D_1 D_2 F)(x_1, x_2)\,dx_2\,dx_1 + F(x, c) + F(a, y) - F(a, c). \end{aligned} \]

From this equation, calculate $D_1 F$. We have

\[ (D_1 F)(x, y) = \int_c^y (D_1 D_2 F)(x, x_2)\,dx_2 + (D_1 F)(x, c). \]


If we differentiate this equation with respect to y, we obtain

\[ (D_2 D_1 F)(x, y) = (D_1 D_2 F)(x, y). \]


This step involves use of the fundamental theorem of calculus and the
fact that $D_1 D_2 F$ is continuous.
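
The first identity in this proof, that the integral of the mixed partial derivative over the box equals F(b, d) - F(b, c) + F(a, c) - F(a, d), can be tested numerically. The following sketch is an illustration under stated assumptions: the function F, the box, and the helper name midpoint_2d are chosen for the example, and the mixed partial derivative is computed by hand.

```python
import math

def F(x, y):
    """An illustrative function of class C^2 on R^2 (an assumption for this example)."""
    return math.exp(x * y) + x ** 3 * math.sin(y)

def D1D2F(x, y):
    """Mixed partial derivative of F, computed by hand."""
    return math.exp(x * y) * (1 + x * y) + 3 * x ** 2 * math.cos(y)

def midpoint_2d(func, a, b, c, d, n=200):
    """Midpoint-rule approximation to the integral of func over [a, b] x [c, d]."""
    hx, hy = (b - a) / n, (d - c) / n
    return hx * hy * sum(func(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
                         for i in range(n) for j in range(n))

a, b, c, d = 0.0, 1.0, -1.0, 2.0
integral = midpoint_2d(D1D2F, a, b, c, d)
corners = F(b, d) - F(b, c) + F(a, c) - F(a, d)
print(integral, corners)
```

The two printed numbers should agree up to the discretization error of the midpoint rule.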


Corollary. Let F be a map from an open subset of $R^n$ into $R^m$ and let
i, j be indices, 1 ≤ i ≤ n, 1 ≤ j ≤ n. If the partial derivative $D_i D_j F$ exists
and is continuous, then $D_j D_i F$ also exists and

\[ D_j D_i F = D_i D_j F. \]

Proof. In the proof of Theorem 13 we used the existence and con-
tinuity of one of the mixed partial derivatives to show that the other
exists and is the same.


Exercises


1. Let F be a continuous function on an open set U in $R^n$. If


\[ \int_B F = 0 \]


for every closed box B ⊂ U, then F = 0.


2. Define

\[ f(X) = |x_1|^{1/2} |x_2|^{1/2}, \qquad X \in R^2. \]


(a) Show that the partial derivatives of f at the origin exist.
(b) Find a vector V such that $(D_V f)(0)$ does not exist.


3. True or false? If f is a function of class $C^1$ on $R^n - \{0\}$ and if the partial
derivatives of f are uniformly continuous functions, then f can be extended to
a function of class $C^1$ on $R^n$.


4. Let f be a (real-valued) function of class $C^1$ on an open set U. If X ∈ U,
show that (at X) the direction in which f is increasing most rapidly is the direc-
tion of f’(X), the gradient of f at X.


5. If B is a box, let X + B be the X-translate of B:

\[ X + B = \{X + Y;\ Y \in B\}. \]

Let f be a continuous function on $R^n$ and let B be a fixed box in $R^n$. Let


\[ g(X) = \int_{X + B} f. \]

Is it true that g is of class $C^1$?


6. If f is of class $C^1$ and X is a point in the domain of f, then $(D_V f)(X)$ is a linear
function of V.


7. (Mean Value Theorem) Recall that a set is convex provided it contains the
line segment joining each pair of its points. Let U be a convex open set in $R^n$
and let f be a (real-valued) function of class $C^1$ on U. If X, Y are in U, show that


\[ f(Y) - f(X) = \langle Y - X, f'(P) \rangle \]

where P is some point on the line segment between X and Y. Hint: Let

\[ g(t) = f(X + t(Y - X)), \qquad 0 \le t \le 1. \]

8. If $k = (k_1, \ldots, k_n)$ is an n-tuple of positive integers, let

\[ k! = k_1! \cdots k_n! \]
\[ D^k = D_1^{k_1} \cdots D_n^{k_n}. \]

Let f be a polynomial function on $R^n$. Show that


\[ f(x_1, \ldots, x_n) = \sum_k \frac{1}{k!} (D^k f)(0)\, x_1^{k_1} \cdots x_n^{k_n}. \]

9. Let F be a continuous function on the box B and let K be a compact subset
of B. Suppose you wanted to define

\[ \int_K F. \]

Here is one thing to try. Let g be a continuous function on B such that 0 ≤ g ≤ 1,
g = 1 on K and g < 1 off K. Prove that


\[ \lim_k \int_B g^k F \]

exists. Hint: Prove it first when F is a non-negative real-valued function.


10. Let F, B, and K be as in Exercise 9. Here is another reasonable way to at-
tempt to define the integral of F over K. If $P = \{B_1, \ldots, B_N\}$ is a partition of B,
consider those boxes $B_k$ which intersect K. For each such k, choose a point
$T_k \in B_k \cap K$. Form the sum


\[ S_K = \sum_k F(T_k)\, m(B_k) \]

where the sum is extended over those k’s such that $B_k$ meets K. Prove that


\[ \lim_{\|P\| \to 0} S_K \]


exists. Hint: Prove that it exists by showing that, if ‖P‖ is small, then


\[ S_K \approx \lim_k \int_B g^k F \qquad \text{(the limit from Exercise 9).} \]


4.6. Riemann-Stieltjes Integration 


Strictly speaking, this section does not constitute part of a review of
the usual calculus course. It deals with a type of integral which is slightly
more general, but technically no more involved, than the integral in $R^1$
which we discussed in Section 4.2. One example of a more general integral
will be


\[ \int_I f\,dG \]


where f is a continuous real-valued function on the interval I and G is a
(vector-valued) function of “bounded variation”. Such an integral is a
limit of sums of the type


\[ \sum_{k=1}^{n} f(t_k)[G(x_k) - G(x_{k-1})]. \]

Let us define carefully the class of functions G which we shall use.
Suppose that


\[ G : [a, b] \to R^k \]

is a (not necessarily continuous) map from [a, b] into $R^k$. If

\[ P = (x_0, \ldots, x_n), \qquad a = x_0 \le x_1 \le \cdots \le x_n = b \]


is a partition of [a, b], we form the sum


\[ V(P; G) = \sum_{k=1}^{n} |G(x_k) - G(x_{k-1})|. \tag{4.37} \]


FIGURE 17


This is the sum of the distances between the successive points $G(x_k)$. If G
is 1:1, it should provide a better approximation to the length of the image
of G in $R^k$, as the mesh of the partition tends to 0. (See Figure 17.)

We say that the function G is of bounded variation if the sums V(P; G)
are bounded, i.e., if there is a constant M such that V(P; G) ≤ M for all
partitions P. If G is of bounded variation, the total variation of G is


\[ V_a^b(G) = \sup_P V(P; G). \tag{4.38} \]


If G is a function of bounded variation on [a, b] and if a ≤ x ≤ b,
then obviously


\[ V_a^x(G) + V_x^b(G) = V_a^b(G). \tag{4.39} \]

The function $V_G$ defined by

\[ V_G(x) = V_a^x(G), \qquad a \le x \le b \tag{4.40} \]


is called the variation of G. It is an increasing function on [a, b].
Before we finish the discussion of integrals, let us look at some 
examples. 


EXAMPLE 5. If G is continuously differentiable (of class $C^1$), then G is
of bounded variation:

$\sum_k |G(x_k) - G(x_{k-1})| = \sum_k \left| \int_{x_{k-1}}^{x_k} G'(t)\,dt \right|$
$\qquad \le \sum_k (x_k - x_{k-1}) \sup |G'|$
$\qquad = (b - a) \sup |G'|.$

The total variation of G is $\int_a^b |G'(x)|\,dx$, although that takes a little
proving.

There is a natural generalization of the example just given. The function
G is said to satisfy a Lipschitz condition if there exists a constant M
such that

$|G(x) - G(y)| \le M |x - y|.$
Evidently, such a function is uniformly continuous and of bounded 
variation. 
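
As a rough numerical illustration of the sums (4.37) and of the bound in Example 5, the following Python sketch evaluates V(P; G) on uniform partitions for one hypothetical path, the unit circle G(t) = (cos t, sin t) on [0, 2π]. The path and the helper names `variation_sum` and `uniform_partition` are assumptions made for this illustration; they do not come from the text.

```python
import math

def variation_sum(G, partition):
    """V(P; G): sum of the distances between successive points G(x_k)."""
    points = [G(x) for x in partition]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

# Hypothetical test path: the unit circle, for which |G'(t)| = 1 everywhere.
def G(t):
    return (math.cos(t), math.sin(t))

a, b = 0.0, 2 * math.pi
# Each partition below refines the previous one (n = 4, 16, 64, 256).
sums = [variation_sum(G, uniform_partition(a, b, n)) for n in (4, 16, 64, 256)]
# Bound from Example 5: (b - a) * sup|G'|, which here equals 2*pi.
bound = (b - a) * 1.0
```

For this path every sum stays below (b − a) sup|G'| = 2π, and the sums grow toward that value as the partition is refined, in line with the statement that the total variation is the integral of |G'|.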


EXAMPLE 6. Arc length is related to functions of bounded variation.
A path in $R^k$ is a continuous map

$G : [a, b] \to R^k$

from some closed interval into $R^k$. Paths can be pretty horrible, but not if
G is of bounded variation. If that is the case, we call G a rectifiable path
(or, path with finite length) and define

$\text{length}(G) = V_a^b(G).$

The motivation for this definition is to be found in Figure 17. Rectifiability
makes the image of G look like a curve, and as t ranges over [a, b] the
point G(t) traces out that curve. In general the curve may cross itself and
G(t) may double back and trace some segments of the curve several times.
This is the reason for calling G the path rather than referring to the image
of G as being the path. In the event that G is 1:1, the length of the path
(i.e., the variation of G) is what we define to be the length of the image
curve G([a, b]).


EXAMPLE 7. Let g be an increasing (non-decreasing) real-valued func- 
tion. Then g is of bounded variation and 


$V_a^b(g) = g(b) - g(a).$

This is because any sum (4.37) telescopes and has the value g(b) − g(a).
Similarly, any decreasing function is a function of bounded variation.
Every real-valued function of bounded variation is the sum of an increasing
function and a decreasing function:

(4.41)    $g = V_g + (g - V_g).$

We remarked that the variation of g (4.40) is an increasing function; and,
the fact that $(g - V_g)$ is a decreasing function simply says that

$g(t) - g(x) \le V_x^t(g), \qquad x < t.$


EXAMPLE 8. An example of a continuous function not of bounded
variation is

$g(x) = x \sin\frac{1}{x}, \qquad 0 < x \le b$
$g(0) = 0.$

The graph of this function oscillates too much to have a finite length. See
Figure 13 on p. 95. At the points $2/\pi,\ 2/(3\pi),\ 2/(5\pi), \ldots$ the function g has
the values $2/\pi,\ -2/(3\pi),\ 2/(5\pi),\ -2/(7\pi), \ldots$. Compute the distances
between the successive points and we see that their sum is at least

$\frac{2}{\pi}\left(1 + \frac{1}{3} + \frac{1}{5} + \cdots\right) = \infty.$

Thus g is not of bounded variation.
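
The divergence in Example 8 can be watched numerically. The sketch below is only an illustration under stated assumptions: it takes the interval to be [0, 2/π] (so that the point 2/π belongs to it), uses the partition made of 0 together with the first n points 2/((2k − 1)π), and the function name `variation_at_peaks` is hypothetical.

```python
import math

def g(x):
    """g(x) = x sin(1/x) for x != 0, and g(0) = 0."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0

def variation_at_peaks(n):
    """Sum of |g(x_k) - g(x_{k-1})| over the partition consisting of 0 and
    the points 2/((2k-1)*pi), k = 1..n, where g takes the values +-x."""
    xs = sorted([0.0] + [2.0 / ((2 * k - 1) * math.pi) for k in range(1, n + 1)])
    return sum(abs(g(q) - g(p)) for p, q in zip(xs, xs[1:]))

# The sums keep growing (roughly like a multiple of log n) as more points are used.
values = {n: variation_at_peaks(n) for n in (10, 100, 1000, 10000)}
```

Since the sums V(P; g) are unbounded over such partitions, no constant M can serve in the definition of bounded variation.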


Now, suppose G is a function on [a, b] and f is a numerical function
on I = [a, b]. If G maps I into $R^k$, take f real-valued. If G maps into $C^k$,
take f complex-valued. For a partition $P = (x_0, \ldots, x_n)$, choose points
$t_k \in [x_{k-1}, x_k]$ and look at the sum

(4.42)    $S(f, P, T, G) = \sum_{k=1}^{n} f(t_k)\,[G(x_k) - G(x_{k-1})].$

If the limit

$\lim_{\|P\| \to 0} S(f, P, T, G)$

exists, we denote it by $\int_I f\,dG$ and say that the Riemann-Stieltjes integral
$\int_I f\,dG$ exists. If $\Sigma_\delta(f, G)$ is the set of all sums (4.42) obtained from
partitions of mesh at most δ, the integral exists if and only if

$\lim_{\delta \to 0} \operatorname{diam} \Sigma_\delta(f, G) = 0$

and, when this condition is satisfied, we have

$\left\{ \int_I f\,dG \right\} = \bigcap_{\delta > 0} \overline{\Sigma}_\delta.$

This integral, when it exists, is also written

$\int_a^b f(x)\,dG(x).$


Theorem 14. If f is continuous and G is of bounded variation, the
Riemann-Stieltjes integral

$\int_I f\,dG = \int_a^b f(x)\,dG(x)$

exists. In fact, if P is a partition of mesh at most δ, any sum (4.42) satisfies

$\left| \int_I f\,dG - S(f, P, T, G) \right| \le \omega(f, 2\delta)\, V_a^b(G).$

Proof. The proof is virtually identical with the proof when G(x) = x,
except that in the estimate $(x_k - x_{k-1})$ is replaced by $|G(x_k) - G(x_{k-1})|$.


The same sort of result is valid for the integral

$\int_I F\,dg = \int_a^b F(x)\,dg(x)$

where F is continuous from [a, b] into $R^m$ [$C^m$] and g is a real [complex]-
valued function of bounded variation.


Theorem 15 (Integration by Parts, Weak Form). If f and G are continuous
functions of bounded variation on [a, b], then

(4.43)    $\int_a^b f(x)\,dG(x) = f(x)G(x)\Big|_a^b - \int_a^b G(x)\,df(x).$

Proof. Let $P = (x_0, \ldots, x_n)$ be a partition of [a, b]. Then

$f(x_k)[G(x_k) - G(x_{k-1})] + [f(x_k) - f(x_{k-1})]G(x_{k-1})$
$\qquad = f(x_k)G(x_k) - f(x_{k-1})G(x_{k-1}).$

Sum over k = 1, ..., n. Let the mesh of the partition go to 0.


EXAMPLE 9. If G is a function of class $C^1$, Riemann-Stieltjes integration
with respect to G does not introduce anything particularly new. In
this case we have

$\int_a^b f(x)\,dG(x) = \int_a^b f(x)G'(x)\,dx$

for each continuous function f. This is apparent by looking at approximating
sums. If $P = (x_0, \ldots, x_n)$ is a partition of [a, b] and $T = (t_1, \ldots, t_n)$
a choice of points $t_k \in [x_{k-1}, x_k]$, then

$\sum_k f(t_k)[G(x_k) - G(x_{k-1})] = \sum_k f(t_k) \int_{x_{k-1}}^{x_k} G'(t)\,dt$
$\qquad = \int_a^b f_{P,T}(t)\,G'(t)\,dt$

where $f_{P,T}$ is the step function which has the (constant) value $f(t_k)$ on the
kth subinterval of the partition P:

$f_{P,T}(t) = f(t_k), \qquad x_{k-1} \le t < x_k.$

For a continuous f, $f_{P,T}$ converges uniformly to f as $\|P\| \to 0$:

$\lim_{\|P\| \to 0} \|f - f_{P,T}\|_\infty = 0.$

Hence the Riemann-Stieltjes sums converge to $\int_a^b f(t)G'(t)\,dt$ as the mesh
of P goes to 0.
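
The identity of Example 9 can be checked numerically on one case. The sketch below is an illustration only: the choices f = cos and G(x) = x² on [0, 1], the midpoint tags, and the helper names `rs_sum` and `midpoint_tagged` are all assumptions of this example, not something taken from the text.

```python
import math

def rs_sum(f, G, partition, tags):
    """S(f, P, T, G) = sum of f(t_k) * (G(x_k) - G(x_{k-1})) for scalar-valued G."""
    return sum(f(t) * (G(x1) - G(x0))
               for x0, x1, t in zip(partition, partition[1:], tags))

def midpoint_tagged(a, b, n):
    """Uniform partition of [a, b] with each t_k the midpoint of its subinterval."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    ts = [(x0 + x1) / 2 for x0, x1 in zip(xs, xs[1:])]
    return xs, ts

f = math.cos                      # continuous integrand (illustrative choice)
G = lambda x: x * x               # class C^1 integrator with G'(x) = 2x
xs, ts = midpoint_tagged(0.0, 1.0, 2000)
approx = rs_sum(f, G, xs, ts)

# Integral of cos(x) * 2x over [0, 1]; an antiderivative is 2(x sin x + cos x).
exact = 2 * (math.sin(1.0) + math.cos(1.0) - 1.0)
```

With a fine partition the Riemann-Stieltjes sum `approx` agrees closely with the ordinary integral of f G'.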


EXAMPLE 10. Let

$f(x) = \frac{1}{2} + \frac{x}{2|x|}, \qquad x \ne 0$
$f(0) = 0.$

On any interval [−b, b], f is of bounded variation. If G is continuous, then

(4.44)    $\int_{-b}^{b} G(x)\,df(x) = G(0).$

This has nothing to do with f(0). We could have defined f(0) to be anything.
Just use partitions which do not involve 0. The “derivative” of f
is what Dirac called the “delta function”. Of course, f is not differentiable
at 0. If f had a derivative, it would satisfy

$\int_{-b}^{b} G(x) f'(x)\,dx = G(0)$

for all continuous G. Now f'(x) = 0 for x ≠ 0. But $\int_{-b}^{b} f' = 1$. So f'(0)
= ∞. So, there is no such function f'. The Dirac delta “function” is the
process of integration against df.
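
A small numerical sketch of (4.44), under stated assumptions: f is taken to be the unit step (0 for x < 0, 1 for x > 0), G = exp is an arbitrary continuous test function, the partitions of [−1, 1] have an odd number of equal subintervals so that 0 is never a partition point, and each subinterval is tagged at its left endpoint. The function name `rs_sum_against_step` is hypothetical.

```python
import math

def f(x):
    """Unit step with a jump of 1 at 0; the value assigned at 0 itself is arbitrary."""
    return 1.0 if x > 0 else 0.0

def rs_sum_against_step(G, b, n):
    """Sum of G(t_k) * (f(x_k) - f(x_{k-1})) over n equal subintervals of [-b, b],
    tagging each subinterval at its left endpoint."""
    xs = [-b + 2.0 * b * k / n for k in range(n + 1)]
    return sum(G(x0) * (f(x1) - f(x0)) for x0, x1 in zip(xs, xs[1:]))

G = math.exp                      # illustrative continuous G with G(0) = 1
odd_counts = (3, 31, 301, 3001)   # odd n keeps 0 out of the partition points
approximations = [rs_sum_against_step(G, 1.0, n) for n in odd_counts]
errors = [abs(s - G(0.0)) for s in approximations]
```

Only the subinterval straddling 0 contributes to each sum, so the sums approach G(0) as the mesh shrinks, whatever the behavior of G elsewhere.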


EXAMPLE 11. Look at the Cantor function from Example 24 of Chapter 3.
It is a continuous function f on the interval [0, 1] such that

(i) f is increasing (non-decreasing);
(ii) f(0) = 0; f(1) = 1;
(iii) f is constant on each of the intervals which were deleted from
[0, 1] to form the Cantor set.

Thus any integral $\int G\,df$ depends only upon the behavior of G on the
Cantor set. But, the dependence is much more involved than in Example 9.


Suppose that we fix a function G

$G : [a, b] \to R^m$

which is of bounded variation. For each continuous real-valued function
f on the interval I = [a, b] the Riemann-Stieltjes integral $\int_I f\,dG$ exists
and has these properties:

(i) $\int_I (cf + g)\,dG = c \int_I f\,dG + \int_I g\,dG$;
(ii) $\left| \int_I f\,dG \right| \le V_a^b(G) \sup |f|$.

There is a particular form of converse to this which is useful in several
parts of mathematics. Suppose we have a function L which assigns to
each continuous function f on I a vector L(f) in $R^m$, and suppose that L
has these two properties:

(i) L is linear, i.e., L(cf + g) = cL(f) + L(g);
(ii) there is a constant M > 0 such that $|L(f)| \le M \sup |f|$ for
every f continuous on I.

Then there exists a function G from I into $R^m$ which is of bounded variation
such that

$L(f) = \int_I f\,dG$

for every f. We shall not stop to prove this here.

The reader has probably noticed a pattern to the various Riemann-
Stieltjes integrals which we have discussed, and has probably observed
that we could introduce other integrals by the same method. For instance,
if F and G are k × k matrix-valued functions, if F is continuous and G
is of bounded variation, the matrix

$\int F\,dG$

is defined. (Be careful; it’s not the same as $\int (dG)\,F$.) We don’t want to
make a fuss about all that; however, let us say this much. We can define
the integral $\int F\,dG$ just as we did, using a variety of different multiplications,
that is, using a variety of different interpretations of the products
in the sums

$F(t_k)[G(x_k) - G(x_{k-1})].$
Suppose

$F : [a, b] \to R^m$
$G : [a, b] \to R^k$

and suppose we are given any “multiplication” of vectors in $R^m$ by vectors
in $R^k$:

$M : R^m \times R^k \to R^d.$

Assume that

(i) M is bilinear; i.e., M(X, Y) is a linear function of X for fixed Y
and a linear function of Y for fixed X.
(ii) $|M(X, Y)| \le |X|\,|Y|$.

If F is continuous and G is of bounded variation, the integral

(4.45)    $\int_a^b M(F(x), dG(x)) = \lim_{\|P\| \to 0} \sum_k M(F(t_k), G(x_k) - G(x_{k-1}))$


exists with the proof which we gave. The integral $\int M(F, dG)$ is a bilinear
function of F and G and

(4.46)    $\left| \int M(F, dG) \right| \le \int |F|\,dV_G.$


Exercises 


1. If G is of bounded variation on [a, b], then G has left- and right-hand limits
at every point of [a, b]. (Prove first for real-valued functions.)


2. If G is of bounded variation, then G is continuous except at a countable
number of points.

3. If G is of bounded variation, then $\int f\,dG$ is independent of the values of G
at the points of discontinuity interior to [a, b].

4. If f is a function of class $C^1$ on [a, b], then

$\int_a^b G\,df = \int_a^b G f'\,dx.$

5. True or false? If f is a continuous function of bounded variation on [a, b]
and if f(a) = 0, then

$f(b) = \int_a^b df.$


6. Show that, if Q is a refinement of P, then

$V(Q; G) \ge V(P; G)$

for all functions G on [a, b].

7. Show that, if G is continuous and of bounded variation, then

$\lim_{\|P\| \to 0} V(P; G) = V_a^b(G).$

8. True or false? Every convex function on [a, b] is of bounded variation.

9. True or false? If f

$f : [a, b] \to [c, d]$

is a 1:1 function of bounded variation, then the inverse function $f^{-1}$ is of
bounded variation.


10. Let f be a differentiable function on the interval [a, b]. Prove that, if f' is of
bounded variation, then f' is continuous.


11. Let g be the function on [0, 1] defined as follows: If x is irrational or zero,
g(x) = 0. If x ≠ 0 is rational and x = p/q (lowest terms), then $g(x) = q^{-3}$. Then
g is of bounded variation. What is $\int f\,dg$?


12. If f is of bounded variation and G is both continuous and of bounded variation,
the Riemann-Stieltjes integral $\int f\,dG$ exists.


13. Let g be an increasing function on [a, b] such that

$\int_a^b dg = 1.$

If F is continuous from [a, b] into $R^m$, then $\int F\,dg$ lies in the closed convex hull
of the range of F. (The closed convex hull of a set is the smallest closed convex
set which contains that set.)


*14. (Mean-value theorem) Let F, g be as in Exercise 13. There exist points
$t_1, \ldots, t_n$ in [a, b] and numbers $c_1, \ldots, c_n$ with

$\int_I F\,dg = c_1 F(t_1) + \cdots + c_n F(t_n), \qquad c_j \ge 0, \quad \sum_j c_j = 1.$


15. (Jensen) Let g be an increasing function on [a, b] such that g(b) − g(a) = 1,
and let φ be a continuous convex function on the real line. If f is any continuous
real-valued function on [a, b], then

$\int (\varphi \circ f)\,dg \ge \varphi\left( \int f\,dg \right).$

(Hint: Let $t = \int f\,dg$. There is a line through (t, φ(t)) such that the graph of φ
is above the line. Write that down and integrate.)


16. According to Exercise 15

$\int e^f\,dg \ge \exp\left( \int f\,dg \right)$

under certain conditions. Use this to prove that, for positive numbers $a_1, \ldots, a_n$,
the arithmetic mean exceeds the geometric mean:

$\frac{1}{n}(a_1 + \cdots + a_n) \ge (a_1 \cdots a_n)^{1/n}.$

(Hint: Pick n points $t_1, \ldots, t_n$; let g jump by 1/n at each $t_k$.)


5. Sequences of Functions


5.1. Convergence 


Suppose that we have a sequence of functions $F_n$ with a common
domain of definition, D. We have in mind that D may be any set whatsoever.
We want to talk about what it means for the sequence $\{F_n\}$ to converge.
Basically, this will mean that the values of the functions converge.
Therefore, we had better have all the functions map into the same (Euclidean)
space:

(5.1)    $F_n : D \to R^m.$

If we fix any point X in D, then $\{F_n(X)\}$ is a sequence of points in $R^m$, and
we can ask whether or not that sequence converges.


Definition. The sequence of functions $\{F_n\}$ (5.1) converges pointwise if,
for each X in D, the sequence $\{F_n(X)\}$ converges. If $\{F_n\}$ converges pointwise
and if

$F(X) = \lim_n F_n(X), \qquad X \in D$

then (F is a function on D and) we say that $\{F_n\}$ converges pointwise to (the
function) F and we write

$F = \lim_n F_n.$


The reader is familiar with the convergence of various sequences of
functions, e.g.,
fix) =x", xeR, 


lim f, = 1; 


$F_n(A) = \sum_{k=0}^{n} A^k, \qquad |A| < 1$

$\lim_n F_n(A) = (I - A)^{-1}.$


What is a vector $Y = (y_1, \ldots, y_m)$ in $R^m$? It is a real-valued function on
the set {1, 2, ..., m}:

$Y(k) = y_k.$

So, a sequence of vectors $Y_n$ in $R^m$ is a sequence of functions $Y_n = (y_{n1}, \ldots, y_{nm})$
with common domain

D = {1, 2, ..., m}

and, these functions converge pointwise if and only if the sequence of
vectors $\{Y_n\}$ converges in $R^m$.

Pointwise convergence of functions seems so natural that one might
wonder why we bother to formalize it. We do so not only because it is a
fundamental idea but also because we wish to emphasize its limitations. We
would like to be able to transfer to $F = \lim_n F_n$ properties of the functions
$F_n$: for instance, we would like to know that F is continuous if each $F_n$ is
continuous. And we would like to establish the continuity of some operations
(interchange the order of two limits), for example,

$\int F = \lim_n \int F_n.$

Pointwise convergence usually is not strong enough to give positive results
in these directions. Here is a stronger form of convergence.


Definition. The sequence of functions $\{F_n\}$ converges uniformly to (the
function) F if, for each ε > 0, there exists a positive integer N such that

$|F(X) - F_n(X)| < \varepsilon, \qquad n > N, \ X \in D.$


In other words, $F_n$ converges uniformly to F, provided (for large n)
$F_n(X)$ is near F(X), where “near” means uniformly near for all X in the
domain. To rephrase, pointwise convergence means that, given ε > 0 and
given X, there exists a positive integer $N_{X,\varepsilon}$ such that

$|F(X) - F_n(X)| < \varepsilon, \qquad n > N_{X,\varepsilon}.$

The convergence is uniform if and only if the integers $N_{X,\varepsilon}$ can be chosen
so as not to depend on X, i.e., if and only if $\{N_{X,\varepsilon};\ X \in D\}$ is bounded (for
each ε > 0).

A convenient notation for uniform convergence is provided by the
sup norm:

(5.2)    $F : D \to R^m, \qquad \|F\|_\infty = \sup \{|F(X)|;\ X \in D\}.$


We will explain in Chapter 6 the reason for the use of the subscript “∞” in
this notation. At times we need to discuss the supremum of |F| on a subset
K of D and we will write

$\sup_K |F| = \sup \{|F(X)|;\ X \in K\}.$


In this notation

$\|F\|_\infty = \sup_D |F|.$


If $\{F_n\}$ is a sequence with a common domain of definition, we say the
sequence is bounded if the sequence of numbers $\{\|F_n\|_\infty\}$ is bounded.
Plainly, $F_n$ converges uniformly to F if and only if

$\lim_n \|F - F_n\|_\infty = 0.$


In order to understand uniform convergence, one should have clearly in
mind some picture of the ε-ball about F which is defined by the sup norm:
$\{G;\ \|F - G\|_\infty < \varepsilon\}$. Suppose f is a real-valued function on (a subset of)
the real line. The “uniform ε-ball about f” consists of the functions g for
which the graph stays in a band about the graph of f. (See Figure 18.)


FIGURE 18 


EXAMPLE 1. This is an elementary but important example. Suppose $f_n$
is defined on the real line by

$f_n(x) = \begin{cases} n^2 x, & 0 \le x \le \frac{1}{n} \\ n - n^2\left(x - \frac{1}{n}\right), & \frac{1}{n} \le x \le \frac{2}{n} \\ 0, & \text{otherwise.} \end{cases}$


FIGURE 19


In other words $f_n$ is a tent function which rises linearly from 0 to n on the
interval [0, 1/n], falls linearly from n to 0 on [1/n, 2/n], and is 0 elsewhere.
The sequence $\{f_n\}$ converges pointwise to 0; however, the convergence is
not uniform because

$\|0 - f_n\|_\infty = \|f_n\|_\infty$

does not even stay bounded, much less go to 0. But, the lack of boundedness
is not the essential element in the lack of uniformity. The tent
functions $g_n = (1/n) f_n$ constitute a bounded sequence which converges
pointwise to 0; and yet $\|g_n\|_\infty = 1$ for all n. See Figure 19.
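
The behavior of the tent functions in Example 1 is easy to tabulate. In the sketch below the constructor name `tent`, the sample point 0.1, and the list of indices are assumptions made for illustration; the formula is the tent of height n on [0, 2/n] described above.

```python
def tent(n):
    """Return f_n: rises linearly 0 -> n on [0, 1/n], falls n -> 0 on [1/n, 2/n], else 0."""
    def f(x):
        if 0.0 <= x <= 1.0 / n:
            return n * n * x
        if 1.0 / n < x <= 2.0 / n:
            return n - n * n * (x - 1.0 / n)
        return 0.0
    return f

indices = (5, 10, 20, 40, 80)
x0 = 0.1                                             # a fixed sample point
values_at_x0 = [tent(n)(x0) for n in indices]        # eventually 0: pointwise convergence
peaks = [tent(n)(1.0 / n) for n in indices]          # the sup norm of f_n is attained at 1/n
scaled_peaks = [p / n for p, n in zip(peaks, indices)]  # sup norm of g_n = f_n / n
```

At the fixed point the values vanish once 2/n < 0.1, while the peaks equal n and the scaled peaks stay at 1, so neither sequence converges uniformly.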


EXAMPLE 2. Let $h_n$ be the real-valued function on the real line

$h_n(x) = \begin{cases} 0, & x \le n \\ x - n, & n \le x \le n + 1 \\ 1, & x \ge n + 1. \end{cases}$


Then $\{h_n\}$ is bounded; $h_1 \ge h_2 \ge h_3 \ge \cdots$; $h_n$ converges pointwise to 0.
But, the convergence is not uniform.


EXAMPLE 3. Let’s look at the convergence of the polynomials

$p_n(x) = \sum_{k=0}^{n} \frac{1}{k!} x^k, \qquad x \in R$

to the exponential function. Is that convergence uniform? Well,

(5.3)    $|e^x - p_n(x)| \le \sum_{k=n+1}^{\infty} \frac{1}{k!} |x|^k.$

On any interval [−b, b], the convergence is uniform because

$\sup_{[-b, b]} |\exp - p_n| \le \sum_{k=n+1}^{\infty} \frac{b^k}{k!}.$

On R, the convergence is not uniform, because (for each n)

$\lim_{x \to \infty} |e^x - p_n(x)| = \infty.$


For many things we wish to do, uniform convergence on compact subsets 
will suffice. 
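
The two halves of Example 3 can be compared numerically. The following sketch is illustrative only: the sampled supremum on a grid is a stand-in for the true supremum, the tail of the series is truncated after 60 terms, and the names `p`, `sup_error` and `tail_bound` are assumptions of this example.

```python
import math

def p(n, x):
    """Partial sum p_n(x) = sum_{k=0}^{n} x^k / k! of the exponential series."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def sup_error(n, b, samples=2001):
    """Maximum of |e^x - p_n(x)| over an evenly spaced grid on [-b, b]."""
    xs = [-b + 2 * b * i / (samples - 1) for i in range(samples)]
    return max(abs(math.exp(x) - p(n, x)) for x in xs)

def tail_bound(n, b, terms=60):
    """Truncated version of the bound sum_{k>n} b^k / k! for the interval [-b, b]."""
    return sum(b ** k / math.factorial(k) for k in range(n + 1, n + 1 + terms))

errors_on_interval = {n: sup_error(n, 2.0) for n in (5, 10, 15)}
bounds_on_interval = {n: tail_bound(n, 2.0) for n in (5, 10, 15)}
error_far_out = abs(math.exp(50.0) - p(10, 50.0))   # a single far-away point on R
```

On [−2, 2] the errors shrink with n and stay within the series bound, whereas for fixed n the error at a distant point is enormous, which is the failure of uniformity on all of R.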


EXAMPLE 4. Let F be a continuous function from the interval [0, 1]
into $R^m$. If n is a positive integer, we partition [0, 1] into subintervals of
equal length using the points

$0, \ \frac{1}{n}, \ \frac{2}{n}, \ \ldots, \ \frac{n-1}{n}, \ 1.$

Define

$F_n(t) = F\left(\frac{k}{n}\right), \qquad \frac{k-1}{n} < t \le \frac{k}{n}, \qquad F_n(0) = F\left(\frac{1}{n}\right).$

Then $F_n$ is a “step function” which has the constant value F(1/n) on the
interval [0, 1/n], the constant value F(2/n) on the interval (1/n, 2/n], and so
on. We assert that the sequence $\{F_n\}$ converges uniformly to F. Let ε > 0.
Since F is continuous on [0, 1], F is uniformly continuous. Hence there
exists δ > 0 such that

$|F(x) - F(t)| < \varepsilon, \qquad |x - t| < \delta.$

Let N be any positive integer such that 1/N < δ. Then

$\|F - F_n\|_\infty \le \varepsilon, \qquad n > N.$

Why? Let t be any point of [0, 1] and n be any integer greater than N. If
t ≠ 0, then t belongs to one of the intervals ((k − 1)/n, k/n] and $F_n(t) =
F(k/n)$. Thus

$|F(t) - F_n(t)| = \left| F(t) - F\left(\frac{k}{n}\right) \right|$

and |F(t) − F(k/n)| < ε, because |t − (k/n)| < 1/n < δ. So we have

$|F(t) - F_n(t)| < \varepsilon, \qquad \text{all } n > N \text{ and all } t \in (0, 1].$


Lemma. Let $\{F_n\}$ be a sequence of functions from the set D into $R^m$.


(i) $\{F_n\}$ converges pointwise to some function F from D into $R^m$ if and
only if $\{F_n(X)\}$ is a Cauchy sequence for each X in D.
(ii) $\{F_n\}$ converges uniformly to some function F from D into $R^m$ if and
only if

$\lim_{k,n} \|F_k - F_n\|_\infty = 0.$

Proof. In each case it is the “if” half of the result which requires comment.


(i) Let X ∈ D. If $\{F_n(X)\}$ is a Cauchy sequence, the completeness of
$R^m$ tells us that $\{F_n(X)\}$ converges to a vector in $R^m$. Call this vector F(X).
If the Cauchy condition holds for each X, then F associates with each X in
D a vector

$F(X) = \lim_n F_n(X)$

in $R^m$.
(ii) The condition

$\lim_{k,n} \|F_k - F_n\|_\infty = 0$

states that, given ε > 0, there is a positive integer N such that

$|F_k(X) - F_n(X)| < \varepsilon, \qquad \text{for all } k, n > N, \text{ and all } X \in D.$


In other words $\{F_n(X)\}$ is Cauchy and converges at a rate which is independent
of X. If we let $F(X) = \lim_n F_n(X)$, then

$|F(X) - F_k(X)| \le \varepsilon, \qquad \text{for all } k > N, \text{ and all } X \in D.$


That is,

$\|F - F_k\|_\infty \le \varepsilon, \qquad k > N.$


EXAMPLE 5. Weierstrass noted long ago a basic (but special) case in
which an infinite series of functions

$\sum_n F_n$

converges uniformly. Suppose that

$\sum_n \|F_n\|_\infty < \infty.$

Then look at the partial sums

$S_n = \sum_{k=1}^{n} F_k.$

We have

$\|S_n - S_k\|_\infty \le \sum_{j=k+1}^{n} \|F_j\|_\infty, \qquad k < n$

and by the last lemma $\{S_n\}$ converges uniformly to S:

$S(X) = \sum_n F_n(X).$
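
As a worked instance of this criterion (an illustration chosen here, not an example from the text), take the hypothetical functions $F_n(x) = \sin(nx)/n^2$ on the real line. Their sup norms are summable, and the Cauchy estimate above becomes explicit:

```latex
\|F_n\|_\infty = \frac{1}{n^2}, \qquad \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty,
\qquad
\|S_n - S_k\|_\infty \le \sum_{j=k+1}^{n} \frac{1}{j^2}
  < \int_k^{\infty} \frac{dx}{x^2} = \frac{1}{k}, \qquad k < n .
```

So the partial sums are uniformly Cauchy, and the series converges uniformly on R.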


There is one important set of circumstances in which pointwise con- 
vergence implies uniform convergence. 




Theorem 1 (Dini). Let D be a compact set in $R^k$, and let $\{f_n\}$ be a
sequence of continuous real-valued functions on D which is monotone-decreasing:

$f_1 \ge f_2 \ge f_3 \ge \cdots.$

If $f_n$ converges pointwise to 0, then $f_n$ converges uniformly to 0.

Proof. Let ε > 0. Let

$U_n = \{X \in D;\ f_n(X) < \varepsilon\}.$

Since $f_n$ is continuous, $U_n$ is open relative to D. Note also that, since $f_n \ge
f_{n+1}$, we have

$U_1 \subset U_2 \subset U_3 \subset \cdots.$

Now, $\{U_n\}$ is a cover of D. Why? If X is in D, then $f_n(X)$ converges to 0;
in particular, $f_n(X) < \varepsilon$ for some n, i.e., X is in $U_n$. Since D is compact and
the $U_n$’s are increasing, some $U_N$ covers D, i.e., for some N, $U_N = D$. We
then have

$0 \le f_n < \varepsilon, \qquad n \ge N.$


Corollary. Let $\{f_n\}$ be a sequence of non-negative continuous functions on
a compact set D. If

$\sum_n f_n(X) < \infty$

for each X ∈ D and if the function

(5.4)    $f = \sum_n f_n$

is continuous, then the series (5.4) converges uniformly.


It would be difficult to emphasize too often how crucial the hypotheses
of monotone convergence and compact domain are in Dini’s theorem. In
Example 1, the continuous functions $f_n$ converge pointwise to 0 as do the
continuous functions $g_n$, but in neither case is the convergence uniform. In
Example 2 the convergence to 0 is monotone but the domain is not compact
and the convergence is not uniform. In the corollary, the monotonicity
of the sequence of partial sums $s_n = \sum_{k=1}^{n} f_k$ is ensured by the condition
$f_k \ge 0$. The a priori knowledge that $f = \lim s_n$ is continuous then guarantees
that Dini’s theorem applies to $\{f - s_n\}$.


EXAMPLE 6. Let

$f(x) = x^{1/2}, \qquad 0 \le x \le 1.$


Here is one way to construct a sequence of polynomials which converges
uniformly to f. Define

$f_0 = 0$
(5.5)    $f_{n+1}(x) = f_n(x) + \tfrac{1}{2}[x - f_n(x)^2], \qquad n \ge 0.$


Now we verify that

(i) $f_n$ is a polynomial function;
(ii) $0 \le f_n \le f$.

Obviously, $f_{n+1}$ is a polynomial function if $f_n$ is. We verify (ii) by mathematical
induction. Certainly (ii) holds for n = 0. Suppose it holds for n = k.
It is then clear from (5.5) that

$f_{k+1} \ge f_k \ge 0.$

Now

$f_{k+1} = f_k + \tfrac{1}{2}(f^2 - f_k^2)$
$\qquad = f_k + \tfrac{1}{2}(f + f_k)(f - f_k)$
$\qquad \le f_k + f\,(f - f_k)$

and (since we are on the compact interval [0, 1]) f ≤ 1. Thus,

$f_{k+1} \le f_k + (f - f_k) = f.$

We conclude that $\{f_n\}$ is a monotone increasing sequence of continuous
functions on [0, 1], bounded above by f:

$0 \le f_1 \le f_2 \le \cdots \le f.$

Thus $\{f_n\}$ converges pointwise. If g is the limit function, then (5.5) shows us
that

$g = g + \tfrac{1}{2}(f^2 - g^2).$

Thus g = f. Since f is continuous, $\{f - f_n\}$ is a sequence of continuous functions
on [0, 1] which converges pointwise monotonely down to 0. By
Dini’s theorem, the convergence is uniform.
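
The recursion $f_{n+1} = f_n + \tfrac12(x - f_n^2)$ from this example can be run directly. The sketch below is a numerical illustration under stated assumptions: the supremum over [0, 1] is approximated by a maximum over an evenly spaced grid, and the function name `sup_gap` is hypothetical.

```python
import math

def sup_gap(n_steps, samples=1001):
    """Run f_{n+1} = f_n + (x - f_n^2)/2 from f_0 = 0 on a grid in [0, 1] and
    record, after each step, the largest value of sqrt(x) - f_n(x) on the grid."""
    xs = [i / (samples - 1) for i in range(samples)]
    fs = [0.0] * samples                      # f_0 = 0 at every grid point
    gaps = []
    for _ in range(n_steps):
        fs = [fn + 0.5 * (x - fn * fn) for fn, x in zip(fs, xs)]
        gaps.append(max(math.sqrt(x) - fn for fn, x in zip(fs, xs)))
    return gaps

gaps = sup_gap(200)   # gaps[n-1] approximates the sup norm of f - f_n
```

The recorded gaps decrease steadily toward 0, which is the uniform convergence that Dini's theorem guarantees; the decay is slow because the convergence is slowest near x = 0.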


Exercises 


1. Let D be a non-empty set and let f be a complex-valued function on D such
that

$|f(X)| < 1, \qquad \text{for each } X \in D.$


Show that the sequence of powers $\{f^n\}$ converges pointwise to the zero function
on D. Prove that $\{f^n\}$ converges uniformly to 0 if and only if

$\sup_D |f| < 1.$


2. Let D be the set of positive integers and let $\{f_n\}$ be the sequence of functions
on D defined by
n—k 
5S agama 


Show that { f,} converges pointwise to (the constant function) 1. Is this conver- 
gence uniform? 


3. Construct a sequence { f,} of continuous functions on the real line R such 
that 


(i) {f,} converges pointwise to 0; 
(ii) none of the functions f, is bounded. 


4. True or false? If $\{F_n\}$ is a sequence of functions which converges uniformly,
then $\{F_n\}$ is a bounded sequence.


5. For which x ∈ R does $\lim_n \cos nx$ exist?
6. Let 
f(x) = { ; οἷαι ἢ, xe ΑΚ. 


The sequence { f,} converges uniformly to 0 on the real line. 


7. Let f be the complex-valued function defined on the open unit disk D =
{z ∈ C; |z| < 1} by

$f(z) = (1 - z)^{-1}, \qquad z \in D.$

Let

$f_n(z) = \sum_{k=0}^{n} z^k.$


Then (as we know) $f_n$ converges pointwise to f. Show that this convergence is
not uniform on D. Then prove that it is uniform on each compact subset of D,
i.e., if K ⊂ D is compact,

$\lim_n \sup_K |f - f_n| = 0.$


8. Let $\{T_n\}$ be a sequence of linear transformations from $R^k$ into $R^m$. If $\{T_n\}$
converges pointwise to T, then

(i) T is a linear transformation;
(ii) the convergence is uniform on each compact subset of $R^k$.


9. Prove Dini’s theorem for functions on closed-bounded sets, using the
Bolzano-Weierstrass theorem rather than the Heine-Borel theorem.


10. Let $F_n$, F be functions of bounded variation on the interval [a, b]. Suppose
that $V_a^b(F - F_n)$ converges to 0 and $F_n(a)$ converges to F(a). Prove that $\{F_n\}$
converges uniformly to F. Why did we have to make the assumption about
$F_n(a)$ and F(a)?


11. Find a sequence of polynomials $f_n$ which converges uniformly to 0 on [a, b]
and for which

$\lim_n V_a^b(f_n) = \infty.$


12. Let $\{p_n\}$ be a sequence of polynomial functions on the real line. Suppose that

(i) $\{p_n\}$ converges pointwise to a function f;
(ii) the degrees deg($p_n$) are bounded.

Prove that f is a polynomial function. Hint: Try to show that the coefficients of
the various powers of x involved in $p_n$ are converging.


13. (Abel’s lemma) Let $\sum V_n$ be a series of vectors in $R^m$ such that the partial
sums of the series lie in a convex set K. Let $c_1 \ge c_2 \ge \cdots \ge 0$ be a decreasing
sequence of non-negative real numbers. Show that the partial sums of the series
$\sum c_n V_n$ lie in the set $c_1 K$.


14. (Abel’s theorem) Let $\{F_n\}$ be a sequence of functions from a set D into $R^m$,
and let $p_1 \ge p_2 \ge \cdots \ge 0$ be a decreasing sequence of non-negative functions
on D. Suppose that the series $\sum F_n$ converges uniformly and that the sequence
$\{p_n\}$ is bounded (i.e., $p_1$ is bounded). Use Abel’s lemma (Exercise 13) to show
that the series $\sum p_n F_n$ converges uniformly.


15. (Dirichlet-Hardy) Let $\{F_n\}$ be a sequence of functions from a set D into
$R^m$, and let $p_1 \ge p_2 \ge \cdots \ge 0$ be a decreasing sequence of non-negative functions
on D. If the partial sums of the series $\sum F_n$ are bounded (in sup norm)
and if the sequence $\{p_n\}$ converges uniformly to 0, then the series $\sum p_n F_n$ converges
uniformly.


5.2. Calculus and Convergence 


Now let us take up the questions which we cited as a motivation for 
the definition of uniform convergence. 


Theorem 2. If $\{F_n\}$ is a sequence of continuous functions

$F_n : D \to R^m, \qquad D \subset R^k$

and if $F_n$ converges uniformly to F, then F is continuous.


Proof. The idea of the proof is this. We have

(5.6)    $F(X) - F(X_0) = [F(X) - F_n(X)] + [F_n(X) - F_n(X_0)] + [F_n(X_0) - F(X_0)]$

for each n. For large n, $F_n(X)$ is near F(X), $F_n(X_0)$ is near $F(X_0)$, and (since
$F_n$ is continuous) $F_n(X)$ will be near $F_n(X_0)$ if $|X - X_0|$ is small. We have
to watch our “nears”.

Let $X_0 \in D$. Let ε > 0. Choose one particular positive integer N so
that

$\|F - F_N\|_\infty < \frac{\varepsilon}{3}.$

Then (5.6) gives us

(5.7)    $|F(X) - F(X_0)| < \frac{2\varepsilon}{3} + |F_N(X) - F_N(X_0)|.$

Since $F_N$ is continuous at $X_0$ there exists δ > 0 such that

$|F_N(X) - F_N(X_0)| < \frac{\varepsilon}{3}, \qquad |X - X_0| < \delta.$

(This δ depends upon ε, $X_0$, and N.) Thus from (5.7)

$|F(X) - F(X_0)| < \varepsilon, \qquad |X - X_0| < \delta.$
Corollary. Let

$F_n : D \to R^m, \qquad D \subset R^k$

and let $X_0$ be a cluster point of D. Suppose

(i) each $F_n$ has a limit at $X_0$;
(ii) $F_n$ converges uniformly to F.


Then F has a limit at $X_0$ and

$\lim_{X \to X_0} F(X) = \lim_n \lim_{X \to X_0} F_n(X).$

Proof. The reader should be able to see that this is a corollary of the
previous theorem. It may prove more profitable to go through the proof of
the theorem again, observing that the corollary has substantially the same
proof: If $L_n$ is the limit of $F_n$ at $X_0$, then $|L_n - L_k| \le \|F_n - F_k\|_\infty$; thus
$L_n$ converges to a point L in $R^m$. Then

$|L - F(X)| \le |L - L_n| + |L_n - F_n(X)| + |F_n(X) - F(X)|$

and argue as before.


The corollary states that uniform convergence allows us to inter- 
change the order of two limits (in certain cases). In this section we repeat 
that fact again and again. The various results are illustrations of this one 
principle. 

As a first application of the principle, suppose we have a double
sequence of points in $R^m$:

$X_{nk}, \qquad n, k = 1, 2, 3, \ldots.$


For descriptive language, think of $\{X_{nk}\}$ as an infinite matrix

$X_{11} \quad X_{12} \quad X_{13} \quad \cdots$
$X_{21} \quad X_{22} \quad X_{23} \quad \cdots$
$X_{31} \quad X_{32} \quad X_{33} \quad \cdots$


One problem in the interchange of order of limits is this. Suppose each
row converges and each column converges:

$A_n = \lim_k X_{nk}$ exists for each n
$B_k = \lim_n X_{nk}$ exists for each k.


When can we conclude that $\lim_n A_n$ and $\lim_k B_k$ exist and are equal?
Evidently, that conclusion is false in general as the following triangular
matrix shows:

1 0 0 0 ...
1 1 0 0 ...
1 1 1 0 ...
1 1 1 1 ...

But, the conclusion is valid if either the row or column convergence is 
uniform (over the columns or rows). 
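
The triangular matrix can be examined with a few lines of Python. This is only a sketch: the entry function `X`, the cutoff `BIG` used as a stand-in for an index tending to infinity, and the helper `sup_gap_to_column_limit` are assumptions made for illustration.

```python
def X(n, k):
    """Entry of the triangular array: 1 on and below the diagonal, 0 above it."""
    return 1 if k <= n else 0

BIG = 10_000  # stand-in for "index tending to infinity" (an assumption of this sketch)

row_limits = [X(n, BIG) for n in range(1, 6)]        # A_n = lim_k X(n, k)
column_limits = [X(BIG, k) for k in range(1, 6)]     # B_k = lim_n X(n, k)

def sup_gap_to_column_limit(n, k_max=BIG):
    """Largest |B_k - X(n, k)| over k <= k_max, using B_k = 1."""
    return max(abs(1 - X(n, k)) for k in range(1, k_max + 1))

gaps = [sup_gap_to_column_limit(n) for n in (1, 10, 100, 1000)]
```

Every row limit is 0 and every column limit is 1, and for each fixed n the distance to the column limits stays equal to 1 for some k, so the column convergence is not uniform in k (and, symmetrically, the row convergence is not uniform in n).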


Theorem 3 (Moore-Osgood Double Limit Theorem). Let $\{X_{nk}\}$ be a
double sequence of vectors in $R^m$. Suppose that

(i) for each n the nth row converges

$A_n = \lim_k X_{nk};$

(ii) for each k the kth column converges

$B_k = \lim_n X_{nk};$

(iii) either the row or column convergence is uniform (in n or k).

Then $\lim_n A_n$ exists, $\lim_k B_k$ exists and these limits are equal. Furthermore,
if X is their common value, the double sequence $\{X_{nk}\}$ converges to X:

$\lim_{n,k} |X - X_{nk}| = 0.$


Proof. Suppose that row limits $A_n$ exist and that the column limits $B_k$
exist, uniformly in k. The latter condition means that, for each ε > 0,
there exists an N such that

$|B_k - X_{nk}| < \varepsilon, \qquad \text{for all } n > N \text{ and for all } k.$


The functions

$F_n : Z_+ \to R^m, \qquad F_n(k) = X_{nk}$

converge uniformly to the function F, $F(k) = B_k$. And each $F_n$ has a limit
at ∞:

$\lim_{k \to \infty} F_n(k) = A_n.$


Thus we are in the situation of the last corollary, except that the “cluster”
point is the point at ∞. The reader may wish to repeat the proof in the
present context, to be certain that the fact that the limits are at ∞ changes
nothing. We conclude that

$\lim_n A_n = \lim_k B_k.$

If X is the common value of these two limits, then

$|X - X_{nk}| \le |X - B_k| + |B_k - X_{nk}|.$

Given ε > 0, choose N so that $|B_k - X_{nk}| < \varepsilon/2$ for all k, provided n > N.
Then choose K so that $|X - B_k| < \varepsilon/2$ provided k > K. If M = max(K,
N), then

$|X - X_{nk}| < \varepsilon, \qquad n, k > M.$

There are some further remarks we should make about the double
limit theorem. It can be restated in terms of the Cauchy criterion for convergence;
for example, if $\{X_{nk}\}$ is a Cauchy sequence for each fixed n and
as n → ∞ is Cauchy uniformly in k, then

$\lim_n \lim_k X_{nk} = \lim_k \lim_n X_{nk} = \lim_{n,k} X_{nk}.$

A result may also be restated for double infinite series $\sum X_{nk}$. We have
already dealt with the most important special case, absolute convergence.
We proved that if

$\sum_{n,k} |X_{nk}| < \infty$

then

$\sum_n \sum_k X_{nk} = \sum_k \sum_n X_{nk}$

and hence that we can talk unambiguously about

$\sum_{n,k} X_{nk}.$


The reader should verify that this follows from the double limit theorem
applied to the sequence

$S_{nk} = \sum_{i=1}^{n} \sum_{j=1}^{k} X_{ij}.$

Absolute convergence ensures uniformity in the convergence of both the
row and column sums of the matrix $\{S_{nk}\}$.


Let us continue with our applications of uniform convergence and the 
interchange of limit operations. 


Theorem 4. Let $\{F_n\}$ be a sequence of continuous functions on a closed
box B in $R^k$. If $F_n$ converges uniformly to F, then

$\lim_n \int_B F_n = \int_B F.$

Proof. Observe that

$\left| \int_B F - \int_B F_n \right| \le \|F - F_n\|_\infty\, m(B).$

Thus the result seems trivial. It is not, however, because we need Theorem
2 to ensure that F is continuous and therefore integrable.
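
To see why uniform convergence cannot be replaced by pointwise convergence here, it may help to revisit the tent functions $f_n$ of Example 1, which rise to height n on a base of length 2/n. The following worked check is an added illustration, restricted (as an assumption) to the box B = [0, 1] and to n ≥ 2 so that the support [0, 2/n] lies in B:

```latex
\int_0^1 f_n(x)\,dx = \tfrac{1}{2}\cdot\tfrac{2}{n}\cdot n = 1 \quad (n \ge 2),
\qquad\text{while}\qquad
\int_0^1 \lim_n f_n(x)\,dx = \int_0^1 0\,dx = 0 .
```

The integrals do not converge to the integral of the pointwise limit, consistent with the fact that $\|f_n\|_\infty = n$ does not tend to 0.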


Theorem 5. Let I be an interval on the real line, and let $\{F_n\}$ be a sequence
of differentiable functions from I into $R^m$. Suppose that

(i) $F_n$ converges to F;

(ii) the derivatives $F_n'$ converge uniformly.


Then F is differentiable and $F' = \lim_n F_n'$.


Proof. Fix x ∈ I. What we want to prove is that

(5.8)    $\lim_{t \to x} \lim_n \frac{F_n(t) - F_n(x)}{t - x} = \lim_n \lim_{t \to x} \frac{F_n(t) - F_n(x)}{t - x}.$


This will follow, by the same reasoning we have been using, if we can show
that the limit

$\lim_n \frac{F_n(t) - F_n(x)}{t - x}$

exists uniformly in t. This is why we have assumed that the derivatives $F_n'$
converge uniformly. We want

(5.9)    $\left| \frac{F_n(t) - F_n(x)}{t - x} - \frac{F_k(t) - F_k(x)}{t - x} \right|$

to be small, uniformly for all t ≠ x, provided k, n are sufficiently large.
The argument is cleaner if we use coordinates. We need only verify the
smallness of (5.9) for real-valued functions. In the real-valued case, the
mean value theorem tells us that (5.9) is

$|f_n'(c) - f_k'(c)|$

where c is between t and x. Thus

$\left| \frac{f_n(t) - f_n(x)}{t - x} - \frac{f_k(t) - f_k(x)}{t - x} \right| \le \|f_n' - f_k'\|_\infty.$

It should now be clear that (5.8) holds.


Of course, the hypotheses of Theorem 5 imply that $F_n$ converges uniformly
to F. In fact, if one knows that $\{F_n'\}$ converges uniformly and that
the values $\{F_n(x)\}$ at any one point converge, then it follows easily that
$\{F_n\}$ converges uniformly.

Theorem 5 tells us that we may differentiate under integral signs in
some cases:

Corollary. Let I be a closed interval and let B be a closed box in $R^k$.
Let F be a continuous function on I × B such that the partial derivative

$(D_1 F)(x, X) = \lim_{t \to x} \frac{F(t, X) - F(x, X)}{t - x}$

exists and is continuous on I × B. Then

$G(t) = \int_B F(t, X)\,dX$

is a differentiable function on I and

$G'(t) = \int_B (D_1 F)(t, X)\,dX.$


Proof. The result follows from Theorem 5 and basic properties of
integration. The point is that the difference quotient converges uniformly
to $D_1 F$ as t tends to x. Some people prefer to write the conclusion as

$\frac{d}{dt} \int_B F(t, X)\,dX = \int_B \frac{\partial F}{\partial t}(t, X)\,dX$

which is fine, if one knows what it means.
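
The corollary can be sanity-checked numerically on one hypothetical case: F(t, X) = sin(tX) on I × B with B = [0, 1], so that $D_1 F(t, X) = X \cos(tX)$. The midpoint quadrature helper `integrate`, the sample point t0 = 0.7, and the step used in the central difference are all assumptions of this sketch.

```python
import math

def integrate(h, a, b, n=2000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (i + 0.5) * step) for i in range(n))

F = lambda t, X: math.sin(t * X)          # illustrative integrand
D1F = lambda t, X: X * math.cos(t * X)    # its partial derivative in t

def G(t):
    """G(t) = integral of F(t, X) over B = [0, 1]."""
    return integrate(lambda X: F(t, X), 0.0, 1.0)

t0, eps = 0.7, 1e-5
difference_quotient = (G(t0 + eps) - G(t0 - eps)) / (2 * eps)   # approximates G'(t0)
integral_of_derivative = integrate(lambda X: D1F(t0, X), 0.0, 1.0)
```

The two numbers agree to many digits, as the formula for G'(t) predicts for this choice of F.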


Exercises 


1. True or false? If $\{f_n\}$ is a sequence of uniformly continuous functions and
if $f_n$ converges uniformly to f, then f is uniformly continuous.


2. True or false? If $F_n$ converges uniformly to F and $G_n$ converges uniformly
to G, then $\langle F_n, G_n \rangle$ converges uniformly to $\langle F, G \rangle$.


3. Let f be a non-negative continuous function on the interval [0, 1]. Suppose
that the sequence $f^{1/n}$ converges uniformly. How many zeros does f have?


4. If f and g are continuous functions on the real line, let

$(f * g)(x) = \int f(x - t)\,g(t)\,dt.$


True or false? If f is of class $C^\infty$, then f * g is of class $C^\infty$.
5. Let 


. ΧΟ 
ξεν χΡ 1, Δ 1. 


True or false? 
lim f(x, y) = 1 
xl 

uniformly in y. True or false? 


.Φ 1 
lim y=-— 
ey 


uniformly in y. 
6. Prove that, if $\{a_{nk}\}$ is any double sequence of non-negative numbers and if
$\infty$ is allowed, then

$$\sum_n \Big( \sum_k a_{nk} \Big) = \sum_k \Big( \sum_n a_{nk} \Big).$$


7. Let {X,,} be the double sequence of complex numbers 


Xnk = eXp (—n + zi): 
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(a) Is the Moore-Osgood double limit theorem applicable to {X,,}? 
(b) Is the double series δὴ X,, absolutely convergent? 


8. If $\{g_n\}$ is a sequence of Riemann-integrable functions, if

$$f_n(x) = \int_a^x g_n(t)\, dt$$

and $\{g_n\}$ converges uniformly on $[a, b]$, then $\{f_n\}$ converges uniformly on $[a, b]$.


9. True or false? If $f$ is continuous on $D$ and $\|f\|_\infty < 1$, then the series

$$\sum_{n=0}^\infty f^n \qquad (f^0 = 1)$$

converges uniformly to $1/(1 - f)$.


10. Let $F$ be a function of class $C^1$ on the real line. Prove that, as $t$ tends to 0,
the difference quotient

$$\frac{F(x + t) - F(x)}{t}$$

converges to $F'(x)$, uniformly on compact subsets of $R$.


*11. Suppose that, for each $t \in [0, 1]$, we have a continuous function $f_t$ on $D$.
Assume that the map from $t$ to $f_t$ is continuous, in the sense that, if $t_n$ converges
to $t$, then $f_{t_n}$ converges uniformly to $f_t$. Define and discuss

$$\int_0^1 f_t\, dt.$$


5.3. Real Power Series 


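Beyond polynomials, the nicest real-valued functions on the real line
are those which can be expanded in a power series:

$$f(x) = \sum_{n=0}^\infty a_n x^n.$$

We have been working with some such functions:

$$e^x = \sum_{n=0}^\infty \frac{x^n}{n!}, \qquad x \in R$$

$$\sin x = \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} x^{2n+1}, \qquad x \in R$$

$$\cos x = \sum_{n=0}^\infty \frac{(-1)^n}{(2n)!} x^{2n}, \qquad x \in R$$

$$\frac{1}{1 - x} = \sum_{n=0}^\infty x^n, \qquad |x| < 1.$$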


Definition. Let $f$ be a complex-valued function on a subset of the real
line. We call $f$ a real-analytic function if

(i) the domain of definition of $f$ is an open set $U$;

(ii) if $x_0 \in U$, there is some power series about $x_0$

$$\sum_{n=0}^\infty a_n (x - x_0)^n$$

which converges to $f$ in a neighborhood of $x_0$.


In other words, a real-analytic function is one which is locally the
sum of a convergent power series. If we want to know about the local
behavior of such functions, we need only study the behavior of a single
convergent power series; and, we may as well center that series at the point
$x_0 = 0$.

So, suppose we have a sequence of numbers $a_0, a_1, a_2, \ldots$. They may
be complex. Consider the formal power series

$$\sum_{n=0}^\infty a_n x^n. \tag{5.10}$$

Does it converge for any $x$? Obviously it does for $x = 0$, but that's not
inspiring. Suppose it does converge for some $t \neq 0$. Then it is easy to see
that it converges in the interval $|x| < |t|$, for this reason. Since the series
converges at $x = t$, we have

$$\lim_n a_n t^n = 0$$

so that there exists an $N$ such that

$$|a_n t^n| < 1, \qquad n > N.$$

Then, if $|x| < |t|$, we have

$$\sum_n a_n x^n = \sum_n a_n t^n \Big( \frac{x}{t} \Big)^n$$

$$\sum_{n=0}^\infty |a_n x^n| \le \sum_{n=0}^N |a_n x^n| + \sum_{n=N+1}^\infty \Big| \frac{x}{t} \Big|^n. \tag{5.11}$$

Since $|x/t| < 1$, the series converges absolutely at each point of the open
interval $|x| < |t|$; and, the convergence is uniform on each compact subset
of that interval, that is, on each set $\{x;\ |x| \le c|t|\}$, where $0 < c < 1$. The
uniformity is clear from (5.11).

Now we see that attached to each formal power series (5.10) is a
radius of convergence

$$r = \sup \Big\{ |t|;\ \sum_n a_n t^n \text{ converges} \Big\}$$

and that radius of convergence possesses and is determined by these properties:

(i) $0 \le r \le \infty$;
(ii) the series converges uniformly and absolutely on each compact
subset of the interval $|x| < r$;
(iii) the series does not converge for any $x$ with $|x| > r$.

Theorem 6. If $r$ is the radius of convergence of the power series

$$\sum_{n=0}^\infty a_n x^n,$$

then

$$\frac{1}{r} = \limsup_n |a_n|^{1/n}. \tag{5.12}$$


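Proof. If the series converges at $x = t$, we have

$$|a_n t^n| < 1, \qquad n > N.$$

Thus

$$|a_n|^{1/n} < \frac{1}{|t|}, \qquad n > N.$$

Therefore, if

$$L = \limsup_n |a_n|^{1/n},$$

we have $L \le 1/|t|$ for each $t$ at which the series converges, and so

$$L \le \frac{1}{r}.$$

Now we want to prove the reverse inequality, $r \ge 1/L$. We show that if
$0 < t < 1/L$ then $r \ge t$. That will do it. Given such a number $t$, choose $c$
so that $t < c < 1/L$. By the definition of lim sup, there is an $N$ such that

$$|a_n|^{1/n} < \frac{1}{c}, \qquad n > N.$$

Thus

$$|a_n t^n| < \Big( \frac{t}{c} \Big)^n, \qquad n > N$$

and since $t/c < 1$, that shows that the series converges at $t$. Hence $r \ge t$.
This completes the proof of the theorem.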
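The formula of Theorem 6 can be explored numerically. The following is a minimal sketch, not part of the text: the helper name `root_test_radius` and the three sample coefficient sequences are assumptions chosen for illustration, and evaluating a single large index only suggests the lim sup, it does not compute it.

```python
import math

def root_test_radius(log_abs_coeff, n):
    """Return 1 / |a_n|^(1/n), computed from log|a_n| to avoid overflow.

    This is one term of the sequence whose lim sup appears in Theorem 6,
    so for large n it only hints at the radius of convergence.
    """
    return math.exp(-log_abs_coeff(n) / n)

# Hypothetical sample coefficients, given through log|a_n|:
geometric = lambda n: n * math.log(2.0)      # a_n = 2^n, radius 1/2
exponential = lambda n: -math.lgamma(n + 1)  # a_n = 1/n!, radius infinity
linear = lambda n: math.log(n)               # a_n = n, radius 1

for n in (10, 100, 1000):
    print(n,
          root_test_radius(geometric, n),
          root_test_radius(exponential, n),
          root_test_radius(linear, n))
```

For $a_n = 2^n$ the estimate equals 1/2 at every index; for $a_n = 1/n!$ it grows without bound; for $a_n = n$ it creeps up toward 1, reflecting the fact that $n^{1/n}$ tends to 1.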


Suppose we have a power series with radius of convergence $r > 0$.
Let us look at the function $f$ which is its sum:

$$f(x) = \sum_{n=0}^\infty a_n x^n, \qquad |x| < r. \tag{5.13}$$

Certainly $f$ is continuous, because the polynomials (partial sums)

$$p_n(x) = \sum_{k=0}^n a_k x^k \tag{5.14}$$

converge uniformly to $f$ on each compact subset of $|x| < r$. In fact $f$ is a
function of class $C^\infty$, for the following reason. Formally differentiate the
series to obtain a new power series

$$\sum_{n=0}^\infty n a_n x^{n-1}. \tag{5.15}$$

Since

$$\lim_n n^{1/n} = 1$$

the series (5.15) has the same radius of convergence as does the series for
$f$. Therefore, the partial sums of the differentiated series converge uniformly
on compact subsets of $|x| < r$. Those partial sums are

$$p_n'(x) = \sum_{k=0}^n k a_k x^{k-1}.$$

On each closed and bounded subinterval, $p_n$ converges to $f$ and $p_n'$ converges
uniformly. By Theorem 5, $f$ is differentiable and

$$f'(x) = \lim_n p_n'(x) = \sum_{n=0}^\infty n a_n x^{n-1}.$$

By the same argument, $f'$ is differentiable, etc.


Theorem 7. Each real-analytic function $f$ is of class $C^\infty$. If

$$f(x) = \sum_{n=0}^\infty a_n (x - x_0)^n$$

in a neighborhood of $x_0$, then (in that same neighborhood) the nth derivative
of $f$ is the sum of the power series obtained by n times (term-by-term)
differentiating the power series for $f$; in particular

$$a_n = \frac{1}{n!} f^{(n)}(x_0).$$


This seems a good point at which to clear up an irritating technical 
question. As we defined real-analytic function, it is required that there 
exist a convergent power series expansion about each point in the (open) 
domain of definition. It is not immediately apparent that a function which 
is defined as the sum of a single convergent power series has that property. 
We need to verify that, if

$$f(x) = \sum_{n=0}^\infty a_n x^n, \qquad |x| < r$$

and if $t \in (-r, r)$, then near $t$ there is a power series in $(x - t)$ which
converges to $f(x)$. That series expansion will (of course) be

$$f(x) = \sum_{n=0}^\infty \frac{1}{n!} f^{(n)}(t) (x - t)^n.$$

Why is it valid? Well, repeated differentiation yields

$$f^{(n)}(t) = \sum_{k=n}^\infty k(k - 1) \cdots (k - n + 1)\, a_k t^{k-n} = n! \sum_{k=n}^\infty \binom{k}{n} a_k t^{k-n}.$$

Formally, we have then

$$\sum_{n=0}^\infty \frac{1}{n!} f^{(n)}(t) (x - t)^n = \sum_{n=0}^\infty \sum_{k=n}^\infty \binom{k}{n} a_k t^{k-n} (x - t)^n = \sum_{k=0}^\infty a_k [t + (x - t)]^k = f(x).$$

In the presence of absolute convergence of the double series, this calculation
will be correct. Replace $a_k$ by $|a_k|$, replace $t$ by $|t|$, and replace
$(x - t)$ by $|x - t|$. Then make the corresponding calculation. A finite
sum is obtained, provided

$$|t| + |x - t| < r.$$

Thus, the series expansion about $t$ is valid on the interval $|x - t| < r - |t|$.


Conclusion: Let $f$ be a complex-valued function on an open set $U$. Then
$f$ is real-analytic if and only if the domain $U$ can be covered by open sets, on
each of which $f$ is the sum of a convergent power series.


Now, suppose we are given a function $f$ on a neighborhood of 0 and
we want to know whether or not $f$ can be represented in some neighborhood
of 0 as the sum of a convergent power series:

$$f(x) = \sum_{n=0}^\infty a_n x^n.$$

We know two things. First, we must start with a function of class $C^\infty$.
Second, the only possible choice of coefficients is

$$a_n = \frac{1}{n!} f^{(n)}(0).$$

So, given a $C^\infty$-function $f$, we write down the formal series

$$\sum_{n=0}^\infty \frac{1}{n!} f^{(n)}(0)\, x^n. \tag{5.16}$$

This is usually called the Taylor series for the function $f$. We ask two questions:

1. Does the series (5.16) converge for any $x \neq 0$ (i.e., does it converge
on a neighborhood of 0)?

2. If the series does converge on a neighborhood of 0, does it converge
to $f$?

One answer to question (1) is provided by Theorem 6: The series (5.16)
converges for some non-zero $x$ if and only if

$$\limsup_n \Big| \frac{1}{n!} f^{(n)}(0) \Big|^{1/n} < \infty.$$

Suppose that the radius of convergence of the series is positive. The series
still need not converge to $f$. The standard example of this phenomenon is:

$$f(x) = \exp \Big( -\frac{1}{x^2} \Big), \qquad x \neq 0$$

$$f(0) = 0.$$

This $f$ is of class $C^\infty$ on the real line, and

$$f^{(n)}(0) = 0, \qquad n = 0, 1, 2, \ldots.$$

The associated series expansion about $x = 0$ certainly converges, but not
to $f$. (Incidentally this $f$ is real-analytic on the intervals $x > 0$ and $x < 0$.)
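A small numerical experiment makes the flatness of this example plausible. It is only a sketch, assuming the standard form $f(x) = \exp(-1/x^2)$ for $x \neq 0$ with $f(0) = 0$; the sample points and exponents are arbitrary choices. It shows that $f(x)/x^n$ is already tiny near 0 for each fixed $n$, which is consistent with every derivative at 0 vanishing, while $f$ itself is positive away from 0.

```python
import math

def f(x):
    """The flat function: exp(-1/x^2) for x != 0, and 0 at x = 0 (assumed form)."""
    return 0.0 if x == 0 else math.exp(-1.0 / (x * x))

# f vanishes to every order at 0: f(x) / x^n shrinks rapidly for each fixed n.
for n in (1, 5, 10):
    print(n, [f(x) / x**n for x in (0.2, 0.1, 0.05)])

# Every partial sum of the Taylor series at 0 is identically 0, yet f is not.
print(f(0.5))
```

So the Taylor series about 0 sums to the zero function, while $f(0.5) = e^{-4}$ is visibly non-zero.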


Taylor's theorem gives an expression for the error involved in approximating
a real-valued function $f$ by the partial sums of its Taylor series.
The result states that the nth remainder

$$R_n(x) = f(x) - \sum_{k=0}^n \frac{1}{k!} f^{(k)}(0)\, x^k$$

is given by

$$R_n(x) = \frac{1}{(n+1)!} f^{(n+1)}(\theta x)\, x^{n+1}$$

where $0 < \theta < 1$. This can be obtained from the mean value theorem. We
shall use a different form of the remainder, one which is not restricted to
real-valued functions.


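Theorem 8. Let $f$ be a complex-valued function of class $C^{n+1}$ on a
neighborhood of 0, and let

$$R_n(x) = f(x) - \sum_{k=0}^n \frac{1}{k!} f^{(k)}(0)\, x^k.$$

Then

$$R_n(x) = \frac{x^{n+1}}{n!} \int_0^1 f^{(n+1)}(tx)(1 - t)^n\, dt.$$

Proof. Let

$$g_k(x) = \frac{x^{k+1}}{k!} \int_0^1 f^{(k+1)}(tx)(1 - t)^k\, dt, \qquad k = 0, \ldots, n.$$

Integrate by parts and you will obtain

$$g_k(x) = -\frac{x^k}{k!} f^{(k)}(0) + g_{k-1}(x)$$

$$g_0(x) = f(x) - f(0).$$

Hence, $g_n(x) = R_n(x)$.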
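The integral form of the remainder is easy to test numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes $f = \exp$ (so that every derivative is again exp), reads the integral form as $x^{n+1}/n!$ times the integral of $f^{(n+1)}(tx)(1-t)^n$ over $[0, 1]$, and approximates that integral with a hand-written composite Simpson rule; the names `simpson`, `remainder_direct`, and `remainder_integral` are hypothetical.

```python
import math

def simpson(func, a, b, m=200):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = func(a) + func(b)
    for j in range(1, m):
        total += (4 if j % 2 else 2) * func(a + j * h)
    return total * h / 3

def remainder_direct(x, n):
    """R_n(x) = e^x minus its Taylor polynomial of degree n about 0."""
    return math.exp(x) - sum(x**k / math.factorial(k) for k in range(n + 1))

def remainder_integral(x, n):
    """Integral form of R_n(x) for f = exp, evaluated by quadrature."""
    integrand = lambda t: math.exp(t * x) * (1 - t) ** n
    return x ** (n + 1) / math.factorial(n) * simpson(integrand, 0.0, 1.0)

for x, n in ((1.5, 3), (-2.0, 4), (0.5, 0)):
    print(x, n, remainder_direct(x, n), remainder_integral(x, n))
```

The two columns agree to within quadrature error, for positive and negative $x$ alike.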
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EXAMPLE 7. We should become acquainted with power series and the 
function 
f(x) = log x, x > 0. 
This is a real-analytic function. If $x_0 > 0$, we would not expect the power
series about $x_0$:

$$\sum_{n=0}^\infty \frac{1}{n!} f^{(n)}(x_0)(x - x_0)^n$$

to converge outside the interval $(0, 2x_0)$, because $f$ is unbounded near
$x = 0$. Since $f'(x) = 1/x$ (Example 1 of Chapter 4),

$$f^{(n)}(x) = \frac{(-1)^{n-1}(n - 1)!}{x^n}, \qquad n \ge 1.$$

The formal series about $x_0$ is therefore

$$\log x_0 + \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n} \Big( \frac{x - x_0}{x_0} \Big)^n.$$

The radius of convergence of this series is plainly $x_0$. Why does the series
converge to $\log x$ on $(0, 2x_0)$? Differentiate the series term-by-term and
you get $1/x$.

Constant use is made of the series expansion of log about $x = 1$. It is
more convenient to translate by 1 and write

$$\log (1 + x) = \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n} x^n, \qquad |x| < 1. \tag{5.17}$$

This series can be obtained immediately from the series

$$\frac{1}{1 + x} = \sum_{n=0}^\infty (-1)^n x^n, \qquad |x| < 1$$

by term-by-term integration. That is a legitimate procedure since the
series for $(1 + x)^{-1}$ converges uniformly on compact subsets of $(-1, 1)$.
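As a quick sanity check on the logarithmic series, one can compare its partial sums with a library logarithm. This is a minimal sketch assuming the expansion $\log(1 + x) = \sum_{n \ge 1} (-1)^{n-1} x^n / n$ on $|x| < 1$; the function name and the sample points are illustrative choices.

```python
import math

def log1p_partial_sum(x, terms):
    """Partial sum of the series for log(1 + x): sum of (-1)^(n-1) x^n / n."""
    return sum((-1) ** (n - 1) * x**n / n for n in range(1, terms + 1))

for x in (0.5, -0.5, 0.9):
    print(x, log1p_partial_sum(x, 60), math.log(1 + x))
```

Convergence is fast well inside the interval and noticeably slower as $|x|$ approaches 1, in line with uniform convergence on compact subsets of $(-1, 1)$ only.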


EXAMPLE 8. Once upon a time, there may have been a mystery about
the function

$$f(x) = \frac{1}{1 + x^2}, \qquad x \in R.$$

This is a real-analytic function on the entire real line. If we expand in a
power series about $x = 0$, it is clear what we must get because

$$\frac{1}{1 + x^2} = \sum_{n=0}^\infty (-1)^n x^{2n}, \qquad |x| < 1. \tag{5.18}$$

Furthermore, the series does not converge either at $x = 1$ or $x = -1$
(because the terms don't go to 0). Why should that be the case? This function
is not badly behaved at one of the end points as is the function
$\log (1 + x)$. Yet it appears that the power series thinks that the function is
badly behaved at the boundary of the interval. This is because the series
doesn't know that we are only using real values of $x$. We can just as well
obtain (5.18) for complex $x$, and the series recognizes that $(1 + x^2)^{-1}$ is not
at all well-behaved at the points $x = \pm i$ in the complex plane.
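The complex point of view can be illustrated with a few lines of arithmetic. The sketch below assumes the expansion $(1 + z^2)^{-1} = \sum_n (-1)^n z^{2n}$ for $|z| < 1$ and uses arbitrarily chosen sample points; it checks the series at a complex point inside the unit disk and shows the function blowing up near $z = i$.

```python
def partial_sum(z, terms):
    """Partial sum of the series sum (-1)^n z^(2n), valid for |z| < 1."""
    return sum((-1) ** n * z ** (2 * n) for n in range(terms))

z = 0.3 + 0.4j                      # |z| = 0.5, inside the unit disk
print(partial_sum(z, 80), 1 / (1 + z * z))

# The function itself is unbounded as z approaches i along the imaginary axis.
for s in (0.9, 0.99, 0.999):
    print(s, abs(1 / (1 + (s * 1j) ** 2)))

# On the real axis at x = 1 the terms are +1, -1, +1, ... and do not tend to 0.
print([(-1) ** n * 1.0 ** (2 * n) for n in range(6)])
```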


Exercises 


1. Expand $(1 + x^2)^{-1}$ in a power series about $x = 1$.


2. What is the radius of convergence of the power series 
Sex? 
n=0 


3. What is the radius of convergence of the power series 


x nt x"? 
n=0 
4. Ἰορ (1 +x) <x. 
5. Let 
f(x) = x", —-1l<x<l. 


Is f of class C”? Is f real-analytic? 


6. Give an example of a sequence of real-analytic functions which converges 
uniformly to a function which is not real-analytic. 


7. If $f$ is real-analytic on a neighborhood of 0,

$$f(x) = \sum_{n=0}^\infty a_n x^n$$

define

$$|||f||| = \sum_{n=0}^\infty |a_n|.$$

If $|||f_n - f_k||| \to 0$, does $\{f_n\}$ converge uniformly on a neighborhood of 0?

8. If $f$ is a function of class $C^\infty$ on the real line and if $f(0) = 0$, then

$$g(x) = \begin{cases} \dfrac{f(x)}{x}, & x \neq 0 \\ f'(0), & x = 0 \end{cases}$$

is of class $C^\infty$.


9. Extend Exercise 8 as follows. If $f$ is of class $C^\infty$, then either

$$f^{(n)}(0) = 0$$

for every $n$ or we can write

$$f(x) = x^n g(x)$$

where $g$ is of class $C^\infty$ and $g(0) \neq 0$. If $f$ is real-analytic and $f \neq 0$, the first
possibility cannot occur.


10. Let $f$ be real-analytic on an interval about the origin. Suppose that there
exists a sequence of points $x_n \neq 0$ such that

$$f(x_n) = 0, \qquad n = 1, 2, 3, \ldots$$

$$\lim_n x_n = 0.$$

Prove that $f = 0$.


11. Show that the product of two real-analytic functions is real-analytic. Hint: If

$$f(x) = \sum_n a_n x^n, \qquad g(x) = \sum_n b_n x^n$$

on the interval $|x| < r$, show that

$$\sum_{n=0}^\infty \Big( \sum_{k=0}^n |a_k|\,|b_{n-k}| \Big) |x|^n < \infty$$

for each $x$ in the interval $(-r, r)$.


12. Show that the quotient of two real-analytic functions is real-analytic off the
set where the denominator is 0. Hint: If $g(0) = 1$, then

$$\frac{1}{g(x)} = \frac{1}{1 - h(x)}$$

where $h(x)$ is small near $x = 0$.


13. Show that the composition of real-analytic functions is real-analytic. 


5.4. Multiple Power Series 


This is a concise discussion of power series in n variables and of the 
functions which can be represented by such series. The situation does not 
require new results or techniques, provided one adopts a tractable nota- 
tion. 

A power series (about the origin) in n variables has the form

$$\sum a_{k_1 k_2 \cdots k_n}\, x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n} \tag{5.19}$$

where each $a_{k_1 \cdots k_n}$ is a complex number. The multiple subscripts are
cumbersome; hence, we rewrite (5.19) as

$$\sum_k a_k X^k \tag{5.20}$$

where $k = (k_1, \ldots, k_n)$ is a vector index (an n-tuple of non-negative
integers) and where

$$X^k = x_1^{k_1} x_2^{k_2} \cdots x_n^{k_n}.$$

There is a potential ambiguity in the meaning of the series (5.20), because
we did not stipulate the order in which the terms are to be summed. That
problem needn't worry us too much, since we will be concerned with
multiple power series only on sets in $R^n$ where they converge absolutely:

$$\sum_k |a_k X^k| < \infty. \tag{5.21}$$

The meaning of (5.21) is not ambiguous and has nothing to do with the
order in which we sum. It means that there is a number $M < \infty$ such that

$$\sum_{k \in K} |a_k X^k| \le M$$

for each finite set of (vector) indices $K$. In the presence of absolute
convergence, the terms in (5.20) may be summed in any order; the result will
be the same. The order which many people regard as most natural is this
one.

Let

$$P_N(X) = \sum_{|k| = N} a_k X^k \tag{5.22}$$

so that $P_N(X)$ is the sum of all the terms $a_k X^k$ for which the degree
$|k| = k_1 + \cdots + k_n$ is equal to $N$. Then we rewrite the series:

$$\sum_k a_k X^k = \sum_{N=0}^\infty P_N(X). \tag{5.23}$$

The latter series has a well-defined meaning. It has partial sums

$$S_N(X) = P_0(X) + \cdots + P_N(X) = \sum_{|k| \le N} a_k X^k \tag{5.24}$$

and it specifies one order in which to sum the terms of the power series, if
it converges absolutely.


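EXAMPLE 9. The power series

$$\sum_k X^k = \sum_k x_1^{k_1} \cdots x_n^{k_n}$$

converges absolutely if $|x_i| < 1$ for each $i$. Clearly

$$\sum_{|k| \le N} |X^k| \le (1 + |x_1| + \cdots + |x_1|^N) \cdots (1 + |x_n| + \cdots + |x_n|^N) \tag{5.25}$$

so that

$$\sum_k |X^k| \le (1 - |x_1|)^{-1} \cdots (1 - |x_n|)^{-1}.$$

A bit more thought along the lines of (5.25) shows that

$$\sum_k X^k = (1 - x_1)^{-1} \cdots (1 - x_n)^{-1}.$$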
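The summation by degree described above can be tried out on the series of Example 9 in two variables. This is a sketch only: the function name `sum_by_degree` and the sample point are illustrative assumptions. It adds up the homogeneous pieces of each degree in turn and compares the result with the product of the two geometric sums.

```python
def sum_by_degree(x, y, max_degree):
    """Partial sum of the double series sum x^j y^k, grouped by total degree j + k."""
    total = 0.0
    for degree in range(max_degree + 1):
        # Homogeneous piece of this degree: all monomials x^j y^(degree - j).
        total += sum(x**j * y ** (degree - j) for j in range(degree + 1))
    return total

x, y = 0.5, -0.25
print(sum_by_degree(x, y, 80), 1 / ((1 - x) * (1 - y)))
```

Because the series converges absolutely for $|x| < 1$, $|y| < 1$, any other order of summation would give the same value.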


Suppose that the series (5.20) converges absolutely at the point
$T = (t_1, \ldots, t_n)$. If $|x_i| \le |t_i|$ for each index $i$, then

$$\sum_k |a_k X^k| = \sum_k |a_k|\,|x_1|^{k_1} \cdots |x_n|^{k_n} \le \sum_k |a_k|\,|t_1|^{k_1} \cdots |t_n|^{k_n} < \infty.$$

Accordingly, the series converges uniformly and absolutely on the box
$\{X;\ |x_i| \le |t_i|,\ i = 1, \ldots, n\}$.

The radius of convergence of the series is the largest $r$, $0 \le r \le \infty$,
such that the series converges at each point in the symmetric box
$\{X;\ |x_i| < r,\ i = 1, \ldots, n\}$. One might expect a vector radius of convergence
$r = (r_1, \ldots, r_n)$ to be definable, extending as far out as possible in each
coordinate direction. Unfortunately, such a concept does not make sense. For
instance, a power series $\sum a_{jk} x^j y^k$ in two variables may converge in the
rectangle $|x| < 1$, $|y| < \frac{1}{2}$ and in the rectangle $|x| < \frac{1}{2}$, $|y| < 1$ but not
converge (everywhere) in the rectangle $|x| < 1$, $|y| < 1$. For such a series
the radius of convergence is $\frac{1}{2}$. This should explain why we used symmetric
boxes.


Definition. Let $f$ be a complex-valued function on a subset of $R^n$. We call
$f$ a real-analytic function if

(i) the domain of $f$ is an open set $U$ in $R^n$;
(ii) for each $X_0 \in U$ there is a power series

$$\sum_k a_k (X - X_0)^k$$

which (is absolutely convergent and) converges to $f$ in a neighborhood of $X_0$.


Theorem 9. Each real-analytic function $f$ is of class $C^\infty$. If

$$f(X) = \sum_k a_k (X - X_0)^k \tag{5.26}$$

then the partial derivatives of $f$ are obtained from term-by-term differentiation
of the power series; in particular,

$$a_k = \frac{1}{k!} (D^k f)(X_0). \tag{5.27}$$

Proof. If (5.26) holds in a neighborhood of $X_0$, then the radius of
convergence of the power series is some number $r > 0$. If we fix values
$x_i = t_i$ for $i \neq j$, then $f(t_1, \ldots, t_{j-1}, x_j, t_{j+1}, \ldots, t_n)$ is a real-analytic
function of $x_j$. Theorem 7 makes it clear that all partial derivatives of $f$
exist and are obtained from term-by-term differentiation. The formula
(5.27) for the coefficients is then immediate. We remind the reader what the
notation is:

$$k! = k_1! \cdots k_n!$$

$$D^k f = D_1^{k_1} \cdots D_n^{k_n} f = \frac{\partial^{|k|} f}{\partial x_1^{k_1} \cdots \partial x_n^{k_n}}.$$


The attempt to expand a given function in a power series follows the
pattern of the one variable case. There is a Taylor's formula for the
remainder after the Nth degree approximation of a smooth function by its
Taylor series. Again, we confine ourselves to the integral expression for the
remainder.


Theorem 10. Let $f$ be a function of class $C^{N+1}$ on $U$, an open convex
neighborhood of the origin in $R^n$. If

$$R_N(X) = f(X) - \sum_{|k| \le N} \frac{1}{k!} (D^k f)(0)\, X^k$$

then

$$R_N(X) = (N + 1) \sum_{|k| = N+1} \frac{X^k}{k!} \int_0^1 (D^k f)(tX)(1 - t)^N\, dt.$$

Proof. Fix an $X$ in the given neighborhood of 0. Let

$$g(t) = f(tX), \qquad 0 \le t \le 1.$$

Then $g$ is a function of class $C^{N+1}$ on the interval [0, 1]. We have

$$g'(t) = \sum_i x_i (D_i f)(tX)$$

$$g''(t) = \sum_{i,j} x_i x_j (D_i D_j f)(tX)$$

$$g^{(M)}(t) = \sum_{i_1, \ldots, i_M} x_{i_1} \cdots x_{i_M} (D_{i_1} \cdots D_{i_M} f)(tX).$$

This is valid for $M = 0, 1, \ldots, N + 1$. Each monomial

$$x_{i_1} \cdots x_{i_M}$$

is of the form $X^k$ with $|k| = M$. Given $k_1, \ldots, k_n$ with $k_1 + \cdots + k_n = M$
there are $M!/k!$ different M-tuples $(i_1, \ldots, i_M)$ in which 1 occurs $k_1$
times, 2 occurs $k_2$ times, etc. Thus,

$$g^{(M)}(t) = \sum_{|k| = M} \frac{M!}{k!} X^k (D^k f)(tX), \qquad 0 \le M \le N + 1.$$

We now apply Theorem 8 to the function $g$:

$$g(1) = \sum_{M=0}^N \frac{1}{M!} g^{(M)}(0) + \frac{1}{N!} \int_0^1 g^{(N+1)}(t)(1 - t)^N\, dt.$$

Simply observe that $g(1) = f(X)$ and

$$f(X) - \sum_{M=0}^N \frac{1}{M!} g^{(M)}(0) = R_N(X) = \frac{1}{N!} \int_0^1 g^{(N+1)}(t)(1 - t)^N\, dt = (N + 1) \sum_{|k| = N+1} \frac{X^k}{k!} \int_0^1 (D^k f)(tX)(1 - t)^N\, dt.$$




Corollary. If $f$ is a function of class $C^{N+1}$ on a neighborhood of the
origin in $R^n$ and if $f(0) = 0$, then

$$f(X) = x_1 f_1(X) + \cdots + x_n f_n(X)$$

where $f_1, \ldots, f_n$ are functions of class $C^N$ on a (possibly smaller) neighborhood
of 0. If $f$ is of class $C^\infty$, the functions $f_1, \ldots, f_n$ may be chosen to be
of class $C^\infty$.


Proof. Apply the theorem in the case $N = 0$:

$$f(X) - f(0) = \sum_{i=1}^n x_i \int_0^1 (D_i f)(tX)\, dt.$$

Since $D_1 f, \ldots, D_n f$ are of class $C^N$, each function

$$f_i(X) = \int_0^1 (D_i f)(tX)\, dt$$

is of class $C^N$. If $f$ is of class $C^\infty$, each $D_i f$ (hence, each $f_i$) is also.


Exercises 


1. If $f$ is real-analytic on $U$ and $g$ is real-analytic on $V$, then

$$h(X, Y) = f(X) g(Y)$$

is real-analytic on $U \times V$.


2. Let $f$ be of class $C^{N+1}$ on a neighborhood of the origin. Show that there exists
a unique polynomial $P$ of degree at most $N$ such that

$$\lim_{X \to 0} \frac{f(X) - P(X)}{|X|^N} = 0.$$
3. If 
h(X) = ΣΙ: Xi, 7 Χμ» XE R’, |X| <1 


I, 


show that (if |k| = M) 
(D‘h)0) = M! 


4. Let $f$ be a real-analytic function on a neighborhood of the origin. If
$f(0, \ldots, 0) = 0$, then

$$f(X) = x_1 f_1(X) + \cdots + x_n f_n(X)$$

where $f_1, \ldots, f_n$ are real-analytic on a neighborhood of the origin.


5.5. Complex Power Series 


A power series

$$\sum_{n=0}^\infty a_n x^n$$

is a formal thing. It is really nothing more than a sequence of coefficients
$a_n$, together with an indicated intent to plug various objects $x$ into the series
and ask whether the resulting series converges. Throughout this book, we
have been substituting complex matrices $A$ in such series, in order to work
with $e^A$, $(I + A)^{-1}$, etc. There is nothing to prevent us from using the
obvious series to define $\sin A$, $\cos A$, $\log (I + A)$ for $|A| < 1$, etc. Matrices
do not represent the ultimate generality in which such substitution is
interesting or important. We could also allow the coefficients themselves
to be matrices and all sorts of other mathematical objects. Obviously, we
cannot discuss all possibilities. But, we are obligated to say something
about (what is indisputably) the most important case.
We consider a power series

$$\sum_{n=0}^\infty a_n z^n \tag{5.28}$$

where the $a_n$'s are complex numbers, and we discuss the behavior of the
resulting function of the complex variable $z$. If the power series (5.28)
converges for a particular value $z = w$, then it converges uniformly and
absolutely on compact subsets of the open disk $|z| < |w|$. The proof is
precisely as in (5.11). Thus the series has a radius of convergence $r$, given by

$$\frac{1}{r} = \limsup_n |a_n|^{1/n}.$$

The series converges uniformly and absolutely on compact subsets of the
disk $D_r = \{z;\ |z| < r\}$; and it converges at no point $z$ for which $|z| > r$.

Let's consider a power series with a positive radius of convergence
and study the function which is the sum of the series:

$$f(z) = \sum_{n=0}^\infty a_n z^n, \qquad z \in D_r. \tag{5.29}$$

We expect $f$ to be a very well-behaved function. In what sense does $f$ have
a derivative? It is easy to see that $f$ has partial derivatives of all orders;
but, let's delay the discussion of that and proceed by a formal analogy
with what we did earlier for real power series. Let $z_0 \in D_r$, then

$$\lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} = \sum_{n=0}^\infty n a_n z_0^{n-1}. \tag{5.30}$$

One can show this the same way as we did in the real case. It can also be
done directly, as follows. Let

$$p_n(z) = \sum_{k=0}^n a_k z^k.$$

Obviously,

$$\lim_{z \to z_0} \frac{p_n(z) - p_n(z_0)}{z - z_0} = \sum_{k=0}^n k a_k z_0^{k-1}.$$

Also,

$$\frac{f(z) - f(z_0)}{z - z_0} - \frac{p_n(z) - p_n(z_0)}{z - z_0} = \sum_{k=n+1}^\infty a_k \frac{z^k - z_0^k}{z - z_0}. \tag{5.31}$$

Now

$$\Big| \frac{z^k - z_0^k}{z - z_0} \Big| = |z^{k-1} + z^{k-2} z_0 + \cdots + z_0^{k-1}| \le k c^{k-1}$$

where $c = \max [|z|, |z_0|]$. From (5.31) we then see that

$$\lim_n \frac{p_n(z) - p_n(z_0)}{z - z_0} = \frac{f(z) - f(z_0)}{z - z_0}$$

and the convergence is uniform on a neighborhood of $z_0$. Therefore we
may interchange the two limits and obtain (5.30). Now we are ready for
two definitions and a theorem.


Definition. Let $f$ be a complex-valued function on a subset of $C$. Then $f$
is a complex-analytic (or holomorphic) function if

(i) the domain of $f$ is an open set $D$ in $C$;
(ii) for each point $z_0 \in D$, there is a power series about $z_0$,

$$\sum_{n=0}^\infty a_n (z - z_0)^n$$

which converges to $f(z)$ in a neighborhood of $z_0$.


Definition. Let $f$ be a complex-valued function on an open set $D$ in $C$.
If $z_0 \in D$, we say that $f$ is complex-differentiable at $z_0$ (or, has a complex
derivative at $z_0$) if the limit

$$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \tag{5.32}$$

exists.


Theorem 11. A complex-analytic function has complex derivatives of all
orders. If

$$f(z) = \sum_{n=0}^\infty a_n (z - z_0)^n$$

in a neighborhood of $z_0$, then (in that same neighborhood) $f^{(n)}$, the nth complex
derivative of $f$, is the sum of the power series obtained by n times formally
differentiating the power series for $f$; in particular,

$$a_n = \frac{1}{n!} f^{(n)}(z_0).$$

The same formal argument which we used for real-analytic functions
shows that a function $f$ is complex-analytic if and only if its (open) domain
can be covered by open sets, on each of which $f$ is the sum of a convergent
power series. In other words, if

$$f(z) = \sum_{n=0}^\infty a_n z^n, \qquad |z| < r$$

and if $|z_0| < r$, then

$$f(z) = \sum_{n=0}^\infty \frac{1}{n!} f^{(n)}(z_0)(z - z_0)^n, \qquad |z - z_0| < r - |z_0|.$$


Thus, we know some complex-analytic functions:

(i) polynomials:

$$p(z) = a_0 + a_1 z + \cdots + a_n z^n$$

(ii) the exponential function:

$$e^z = \sum_{n=0}^\infty \frac{z^n}{n!}$$

(iii) the sine function:

$$\sin z = \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)!} z^{2n+1}$$

(iv) the cosine function:

$$\cos z = \sum_{n=0}^\infty \frac{(-1)^n}{(2n)!} z^{2n}.$$

Their complex derivatives may be computed using term-by-term differentiation
of the power series. Hence if

$$f(z) = e^z$$
$$g(z) = \sin z$$
$$h(z) = \cos z$$

we have $f' = f$, $g' = h$, $h' = -g$.
Soon, we shall list many ways in which known analytic functions can
be combined to yield new ones.


Theorem 12 (Identity Theorem). Let $f$ be a complex-analytic function
on a connected open set $D$ in the plane. Suppose that there is a sequence of
distinct points $z_n \in D$ such that

(i) $f(z_n) = 0$;

(ii) $\{z_n\}$ converges to a point $z_0$ in $D$.

Then $f = 0$.


Proof. For convenience, assume that $z_0 = 0$. We have

$$f(z) = \sum_{n=0}^\infty a_n z^n, \qquad |z| < r.$$

Since $f(z_n) = 0$ and $\lim z_n = 0$, the constant $a_0 = f(0)$ is 0. Let

$$g(z) = \sum_{n=0}^\infty a_{n+1} z^n, \qquad |z| < r$$

so that $f(z) = z g(z)$. Now $0 = z_n g(z_n)$, and since $z_1, z_2, z_3, \ldots$ are distinct
points, not more than one of them is 0. Conclusion: $g(0) = 0$, i.e., $a_1 = 0$.
Now divide $g(z)$ by $z$ and repeat the argument. By induction, we obtain
$a_n = 0$ for all $n$.

What we just showed was this: If $z_0 \in D$ and if there is a sequence of
distinct points $z_n$ with $f(z_n) = 0$ and $z_0 = \lim z_n$, then $f(z) = 0$ on a
neighborhood of $z_0$. Let $N$ be the set of all points $z_0 \in D$ for which such a
sequence $\{z_n\}$ exists. We just showed that $N$ is open relative to $D$. Obviously,
$N$ is closed relative to $D$. By hypothesis, $N$ is non-empty. Since $D$ is
connected, $N = D$ (which tells us that $f = 0$).


To the reader who understood what we did with real power series, the 
things we have had to say thus far about complex power series are cause for 
little excitement. Although we did not state it earlier, the identity theorem 
is obviously valid for real-analytic functions on an interval. We indicated 
that the study of complex-analytic functions is important. It is also a 
fascinating subject which has quite a different flavor from the sort of anal- 
ysis which we have been discussing. At this point, we begin to see why. 

If we were to proceed as in Section 5.3, it would seem natural for us
now to prove a complex version of Taylor's theorem. There is no need for
such a theorem. One of the remarkable facts in mathematics is this: If $f$
is a complex-differentiable function, then $f$ is complex-analytic. That is,
the existence of just the first complex derivative implies that ($f$ has derivatives
of all orders and) $f$ is locally the sum of a convergent power series.
This result, which we shall prove later when the derivative is known to be
continuous, contrasts sharply with the situation on the real line, and it
suggests that perhaps we should take a close look at the connection between
real and complex derivatives.

Suppose that $f$ is complex-analytic near the origin in the plane:

$$f(z) = \sum_{n=0}^\infty a_n z^n, \qquad z \in D_r.$$

Identify $C$ with $R^2$ in the usual way and regard $f$ as a complex-valued function
on $R^2$, i.e., regard $f(x + iy)$ as a function of the two real variables
$x$, $y$. Certainly $f$ is a smooth function on $R^2$. In fact it is a real-analytic
function on $R^2$, because we can expand $f(x + iy)$ in a double power series
in $x$ and $y$. Use the binomial theorem on each term $z^n$:

$$f(x + iy) = \sum_n a_n (x + iy)^n = \sum_n a_n \sum_{k=0}^n \binom{n}{k} x^k (iy)^{n-k} = \sum_{k,p} c_{kp} x^k y^p \tag{5.33}$$

where

$$c_{kp} = i^p \binom{k + p}{k} a_{k+p}.$$

The last equality in (5.33) is legitimate if the series

$$\sum_{k,p} c_{kp} x^k y^p$$

is absolutely convergent, for then, we may regroup its terms in the form
$\sum a_n (x + iy)^n$. Now

$$\sum_{k,p} |c_{kp}|\,|x|^k |y|^p = \sum_n |a_n| (|x| + |y|)^n.$$

(Regrouping is always legitimate for a series with non-negative terms.)
Thus the double series is absolutely convergent if $|x| + |y| < r$, and
therefore $f$ is a real-analytic function of $x$, $y$ on a neighborhood of the
origin.


EXAMPLE 10. Not every real-analytic function of $x$, $y$ is complex-analytic.
There are so many ways to see this fact that it is difficult to
choose one. Perhaps the most obvious one is to let

$$g(x + iy) = x.$$

It should be clear that $x = \sum a_n (x + iy)^n$ is a bit difficult to arrange
through any choice of the constants $a_n$. We could also show that $g$ fails to
have a complex derivative:

$$\lim_{z \to 0} \frac{g(z) - g(0)}{z} = \lim_{z \to 0} \frac{x}{z}.$$

If the origin is approached along the imaginary axis, $x z^{-1}$ tends to, in
fact is, 0; but, as the origin is approached along the real axis, $x z^{-1}$ tends
to 1.


If $f$ is complex-analytic, the fact that $f(x + iy)$ is (locally) the sum of
a power series in $x$ and $y$ tells us in particular that

$$f : R^2 \to C$$

is of class $C^\infty$. There is another way to show that $f$ has partial derivatives of
all orders, and it provides a description of one extra condition which a
smooth function on $R^2$ must satisfy in order to be complex-analytic. In
presenting this, we warn the reader not to confuse the complex derivative
$f'$ with the gradient of $f$.
If the complex derivative

$$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}$$

exists, then the limit exists as $z$ approaches $z_0$ along any line through $z_0$.
The derivative of $f$ at $z_0$ in the direction of the vector (complex number)
$w$ is

$$(D_w f)(z_0) = \lim_{t \to 0} \frac{f(z_0 + tw) - f(z_0)}{t} = w \lim_{t \to 0} \frac{f(z_0 + tw) - f(z_0)}{tw} = w f'(z_0).$$

In particular,

$$\frac{\partial f}{\partial x} = f', \qquad \frac{\partial f}{\partial y} = i f'. \tag{5.34}$$

If $f$ is complex-analytic, then $f'$ exists and is itself complex-analytic.
Repeated application of (5.34) shows that $f$ is of class $C^\infty$. We have also
picked up an important piece of information about complex-differentiability.


Theorem 13. Let $f$ be a complex-valued function of class $C^1$ on an open
set $U$ in the plane. The following are equivalent.

(i) $f$ is complex-differentiable on $U$.

(ii) $\dfrac{\partial f}{\partial x} + i \dfrac{\partial f}{\partial y} = 0$ (throughout $U$).

(iii) If $u = \operatorname{Re} f$, $v = \operatorname{Im} f$, then

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \quad \text{(throughout } U\text{)}.$$

Proof. In (5.34) we just showed that (i) implies (ii). Replace $f$ by $u + iv$
in (ii) and remember that a complex number is 0 if and only if its real
and imaginary parts are 0. You will see immediately that (ii) and (iii) are
equivalent. Suppose that (ii) holds. We shall show that $f$ is complex-differentiable
at $z_0 \in U$. For convenience, assume that $z_0 = 0$. Since $f$ is of
class $C^1$, we have

$$f(x + iy) - f(0) = x \int_0^1 (D_1 f)(tz)\, dt + y \int_0^1 (D_2 f)(tz)\, dt$$

for all $z = x + iy$ near the origin. This is a special case of Theorem 5.9;
or, it may be viewed as an application of Theorem 5.7. But, all it amounts
to is the fundamental theorem of calculus applied to the function
$g(t) = f(tz)$ on the interval $0 \le t \le 1$. From the differential equation (ii), which
relates $D_1 f$ and $D_2 f$, we have

$$f(z) - f(0) = z \int_0^1 (D_1 f)(tz)\, dt.$$

Thus

$$\lim_{z \to 0} \frac{f(z) - f(0)}{z} = \lim_{z \to 0} \int_0^1 (D_1 f)(tz)\, dt$$

and since $D_1 f$ is continuous, the last limit is $(D_1 f)(0)$. This shows that $f$
is complex-differentiable and (in the process) that (5.34) holds.
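Condition (ii) lends itself to a numerical spot check. The sketch below is an illustration, not a proof: it reads (ii) as the vanishing of $\partial f/\partial x + i\,\partial f/\partial y$, approximates both partial derivatives by central differences with an arbitrarily chosen step `h`, and the helper name `cauchy_riemann_residual` is hypothetical. It is applied to the exponential function, which is complex-analytic, and to $g(x + iy) = x$, which is not.

```python
import cmath

def cauchy_riemann_residual(f, z, h=1e-6):
    """Central-difference estimate of df/dx + i * df/dy at the point z."""
    df_dx = (f(z + h) - f(z - h)) / (2 * h)
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return df_dx + 1j * df_dy

z = 0.3 + 0.7j
print(abs(cauchy_riemann_residual(cmath.exp, z)))          # close to 0
print(cauchy_riemann_residual(lambda w: w.real, z))        # close to 1, not 0
```

The residual for the exponential is zero up to discretization and rounding error, while for the real-part function it equals $\partial g/\partial x = 1$, so the condition fails at every point.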


The equations in condition (iii) of the theorem are known as the
Cauchy-Riemann equations and (ii) is known as the Cauchy-Riemann condition.
This condition

$$\frac{\partial f}{\partial x} + i \frac{\partial f}{\partial y} = 0$$

is a simple way of phrasing complex-differentiability for a function of
class $C^1$. If $f$ is of class $C^2$ (which is automatic but we don't know that yet),
we can differentiate the Cauchy-Riemann equations:

$$\frac{\partial^2 f}{\partial x^2} + i \frac{\partial^2 f}{\partial x\, \partial y} = 0$$

$$\frac{\partial^2 f}{\partial y\, \partial x} + i \frac{\partial^2 f}{\partial y^2} = 0$$

and, since the mixed partial derivatives are equal, we obtain

$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0. \tag{5.35}$$

The differential equation (5.35) is called Laplace's equation and its solutions
are called harmonic functions. If $f$ is complex-differentiable and of
class $C^2$, then the real and imaginary parts of $f$ (as well as $f$) are harmonic
functions.


Exercises 


1. If $f$ is complex-analytic on a neighborhood of $w$, then

$$g(z) = \frac{f(z) - f(w)}{z - w}$$

is complex-analytic on that same neighborhood.


2. If $f$ is complex-analytic and non-constant on a connected open set $D$ and if
$w \in D$, then for some $n$

$$f(z) - f(w) = (z - w)^n g(z)$$

where $g$ is complex-analytic on $D$ and $g(w) \neq 0$.


3. Let $g$ be a piecewise-continuous (or an integrable) function on [0, 1]. Then

$$F(z) = \int_0^1 \frac{g(x)}{x - z}\, dx$$

is complex-analytic off the interval [0, 1].


4. Suppose f is complex-analytic on a disk D. If f is real-valued, then f is 
constant. 


5. Let f be a complex-analytic function on a connected open set D. If f’ = 0, 
show that f is constant. 


6. The function f(z) = e^z is (of course) complex-analytic on the entire plane. 
Verify by direct calculation that the real part of f is harmonic. 


7. Introduce polar coordinates in the plane by x = r cos θ, y = r sin θ. Verify 
that the Cauchy-Riemann condition can be expressed this way: 

\[ \frac{\partial f}{\partial r} + \frac{i}{r} \frac{\partial f}{\partial \theta} = 0. \]

8. Let f be a complex-valued function of class C^2 on an open set in the plane. 
Prove that f is complex-differentiable if and only if f(z) is harmonic and zf(z) 
is harmonic. 


9. Let u be a real-valued harmonic function. Show that, if u^2 is also harmonic, 
then u is constant. 


10. Use a power series to define 

\[ \log(1 + z), \qquad |z| < 1. \]

Of course, you want exp (log (1 + z)) = 1 + z. Refer to Example 7 and be sure 
to verify that exp (log (1 + z)) = 1 + z. 


11. Show that it is not possible to extract the square root of z analytically on a 
neighborhood of the origin. In other words, prove that no complex-analytic 
function f satisfies f(z)? = z on a neighborhood of the origin. 


*12. Show that it is not possible to extract the square root of z continuously on a 
neighborhood of the origin. 


13. Show that it is possible to extract the square root of z analytically in a 
neighborhood of z = 1. (Hint: See Exercise 10.) 


14. Use a power series to define (for complex matrices A) 

\[ \log(I + A), \qquad |A| < 1. \]

Show that exp (log (I + A)) = I + A. 


15. If N is a k × k matrix which is nilpotent (N^k = 0), then I + N = e^B for 
some complex matrix B. 


16. If you know what the Jordan canonical form for a complex matrix is, use it 
to prove that the exponential map 

\[ A \mapsto e^A \]

maps the k × k complex matrices onto the invertible k × k matrices. 


5.6. Fundamental Results 
on Complex-Analytic Functions 


Now that we understand a bit about complex differentiation, we’ll 
begin to sample some of the elegant methods and results in the study of 
complex-analytic functions. A number of special properties of 
complex-analytic functions can be derived from an elementary one, known as the 
mean value property. 

Lemma. If {z; |z − z_0| ≤ ρ} is a closed disk contained in the domain 
of the complex-analytic function f, then 

(5.36)    \[ f(z_0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(z_0 + \rho e^{i\theta})\, d\theta. \]


Proof. The hypothesis is that f is complex-analytic both inside and on 
C_ρ, the circle of radius ρ about the point z_0. We are asked to show that the 
average (or mean) value of f over C_ρ is equal to the value of f at the center 
of C_ρ. It’s not difficult to see how someone guessed that such is the case. 
For simplicity, take z_0 = 0. Then 

\[ f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad |z| < \varepsilon \]

or 

\[ f(re^{i\theta}) = \sum_{n=0}^{\infty} a_n r^n e^{in\theta}, \qquad 0 \le r < \varepsilon. \]

Now 

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{in\theta}\, d\theta = \begin{cases} 0, & n \ne 0 \\ 1, & n = 0. \end{cases} \]

If r < ε, then (by the uniform convergence of the series) 

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} f(re^{i\theta})\, d\theta = \sum_{n=0}^{\infty} a_n r^n\, \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{in\theta}\, d\theta = a_0 = f(0). \]


It appears that we are finished; however, we are not quite. The 
hypothesis is that f is analytic for |z| ≤ ρ. We have shown that 

\[ f(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(re^{i\theta})\, d\theta, \qquad 0 \le r < \varepsilon \]

where ε is the radius of convergence of the series expansion of f about the 
origin. It is a theorem that, since f is complex-analytic on |z| ≤ ρ, the 
series expansion for f about 0 must converge on |z| < ρ; but, we don’t 
know this yet. We have to face the possibility that ε < ρ. 

Define 

(5.37)    \[ I(r) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(re^{i\theta})\, d\theta, \qquad 0 \le r \le \rho. \]

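We'll verify that I(r) is constant, by showing that 

\[ \frac{dI}{dr} = 0. \]

Now f is complex-differentiable and therefore has radial and angular 
partial derivatives. At a point z_0 = re^{iθ_0} with r > 0, 

\[ \frac{\partial f}{\partial r}(z_0) = \lim_{t \to r} \frac{f(te^{i\theta_0}) - f(re^{i\theta_0})}{t - r} = e^{i\theta_0} f'(z_0), \qquad \frac{\partial f}{\partial \theta}(z_0) = i z_0 f'(z_0). \]

Thus the complex-differentiability implies that 

\[ \frac{\partial f}{\partial r} + \frac{i}{r} \frac{\partial f}{\partial \theta} = 0. \]

So 

\[ \frac{dI}{dr} = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{\partial f}{\partial r}(re^{i\theta})\, d\theta = -\frac{i}{2\pi r} \int_{-\pi}^{\pi} \frac{\partial f}{\partial \theta}(re^{i\theta})\, d\theta = -\frac{i}{2\pi r} \left[ f(re^{i\pi}) - f(re^{-i\pi}) \right] = 0. \]

Now we know that I(r) is constant. Furthermore, I(r) = f(0) for 
small r. This proves the mean value property. 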
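
The mean value property (5.36) is easy to test numerically. The sketch below is an illustration under stated assumptions: the helper `circle_mean`, the sample count `n`, and the choice of test functions are ours, not the text's. It averages f over a circle with an equally spaced Riemann sum and compares the result with the value at the center; |z|^2 is included as a non-analytic function for which the property fails.

```python
import cmath
import math

def circle_mean(f, z0, rho, n=2000):
    """Approximate (1/2pi) * integral of f(z0 + rho e^{i theta}) d(theta)
    over [-pi, pi] by an n-point Riemann sum."""
    total = 0j
    for k in range(n):
        theta = -math.pi + 2.0 * math.pi * k / n
        total += f(z0 + rho * cmath.exp(1j * theta))
    return total / n

z0 = 0.3 + 0.2j
mean_exp = circle_mean(cmath.exp, z0, 1.5)   # should match exp(z0)

# |z|^2 is not complex-analytic: its mean over |z| = 1.5 is 2.25, not its value 0 at the center
mean_abs_sq = circle_mean(lambda z: z * z.conjugate(), 0j, 1.5)
```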


Many people prefer to write the integral involved in the mean value 
property as a Riemann-Stieltjes integral with respect to the function z(θ) = 
ρe^{iθ} on the interval [−π, π]. We have 

\[ dz(\theta) = \frac{dz}{d\theta}\, d\theta = i\rho e^{i\theta}\, d\theta = iz\, d\theta \]

so that (as in Example 9) we can also write the mean value property this 
way: 

\[ f(0) = \frac{1}{2\pi i} \int \frac{f(z)}{z}\, dz. \]

This is a very useful way to write the mean value property and it is in many 
respects a more “natural” way when dealing with analytic functions. As it 
stands, the notation has a defect in that it does not indicate which circle 
|z| = ρ is involved. In other words, in order to regard z as a function of θ 
we had to choose some ρ and fix |z| = ρ; the notation for the integral 
should reflect this choice. Accordingly, if g is any continuous 
complex-valued function on the circle |z| = ρ, we define 

\[ \int_{|z| = \rho} g(z)\, dz \]

to be the integral 

\[ \int_{-\pi}^{\pi} g(z)\, iz\, d\theta, \qquad z = \rho e^{i\theta}. \]

The reader will probably recognize 

\[ \int_{|z| = \rho} g(z)\, dz \]

as the line integral over the (counterclockwise oriented) circle |z| = ρ of 
the differential 1-form 

\[ g(z)\, dz = g(x, y)(dx + i\, dy). \]

At the moment it is merely a piece of shorthand. 


Theorem 14 (Cauchy). If f is complex-analytic on (an open set which 
contains) the closed disk of radius ρ about the origin, then 

(5.38)    \[ f(w) = \frac{1}{2\pi i} \int_{|z| = \rho} \frac{f(z)}{z - w}\, dz, \qquad |w| < \rho. \]

Proof. Fix w. Let 

\[ g(z) = z\, \frac{f(z) - f(w)}{z - w}. \]

Of course, we define g(w) = wf’(w). Then g is complex-analytic where f is. 
(Exercise 1 of the last section.) Apply the mean value theorem to g: 

\[ 0 = g(0) = \frac{1}{2\pi i} \int_{|z| = \rho} \frac{g(z)}{z}\, dz = \frac{1}{2\pi i} \int_{|z| = \rho} \frac{f(z)}{z - w}\, dz - f(w)\, \frac{1}{2\pi i} \int_{|z| = \rho} \frac{dz}{z - w}. \]

Simply calculate that 

\[ \frac{1}{2\pi i} \int_{|z| = \rho} \frac{dz}{z - w} = 1, \qquad |w| < \rho \]

and Cauchy’s formula follows. The power series argument in the proof of 
the next corollary should make it clear how to carry out this calculation. 
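
For readers who like to see numbers, here is a small sketch of Cauchy's formula (5.38) using the definition of the integral over |z| = ρ given above, with dz = iz dθ. The function names `contour_integral` and `cauchy_value`, the sample count, and the test function sin z are assumptions made for the illustration.

```python
import cmath
import math

def contour_integral(g, rho, n=4000):
    """Approximate the integral of g(z) dz over |z| = rho, using dz = i z d(theta)."""
    total = 0j
    for k in range(n):
        theta = -math.pi + 2.0 * math.pi * k / n
        z = rho * cmath.exp(1j * theta)
        total += g(z) * 1j * z
    return total * (2.0 * math.pi / n)

def cauchy_value(f, w, rho):
    """Right-hand side of (5.38): (1/2 pi i) * integral of f(z)/(z - w) dz over |z| = rho."""
    return contour_integral(lambda z: f(z) / (z - w), rho) / (2j * math.pi)

w = 0.5 - 0.25j   # a point with |w| < 1

# The auxiliary calculation in the proof: (1/2 pi i) * integral of dz/(z - w) equals 1
winding = contour_integral(lambda z: 1.0 / (z - w), 1.0) / (2j * math.pi)

# Cauchy's formula recovers f(w) from the values of f on the circle
recovered = cauchy_value(cmath.sin, w, 1.0)
```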


Corollary. Let f be a complex-analytic function on an open set D, and 
let z_0 be a point in D. The power series expansion for f converges to f in the 
largest disk which is centered at z_0 and contained in D. 


Proof. We may as well assume that z_0 = 0. Suppose that the closed 
disk |z| ≤ ρ is contained in D. We want to show that the power series for 
f about the origin converges to f on the disk |z| < ρ. Look at Cauchy’s 
formula for f(w), given in (5.38). If |z| = ρ and |w| < ρ, then 

\[ \frac{1}{z - w} = \frac{1}{z} \cdot \frac{1}{1 - z^{-1}w} = \frac{1}{z} \left[ 1 + \frac{w}{z} + \left( \frac{w}{z} \right)^2 + \cdots \right] = \sum_{n=0}^{\infty} \frac{w^n}{z^{n+1}} \]

and so 

\[ f(w) = \frac{1}{2\pi i} \int_{|z| = \rho} \sum_{n=0}^{\infty} \frac{f(z)}{z^{n+1}}\, w^n\, dz = \sum_{n=0}^{\infty} a_n w^n \]

where 

(5.39)    \[ a_n = \frac{1}{2\pi i} \int_{|z| = \rho} \frac{f(z)}{z^{n+1}}\, dz. \]


The last corollary removes an irritating detail which bothered us be- 
fore. It is not, however, a minor detail. To understand this, the reader 
should think very carefully about what the corollary says, starting from 
our definition of complex-analytic function. 
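
Formula (5.39) can also be read as a recipe for computing power series coefficients from values on a circle. The sketch below is only an illustration: the name `taylor_coefficient`, the radius, and the number of sample points are assumptions. With z = ρe^{iθ} and dz = iz dθ, the integral in (5.39) becomes the average of f(z)/z^n over the circle.

```python
import cmath
import math

def taylor_coefficient(f, n, rho=1.0, samples=512):
    """Approximate a_n = (1/2 pi i) * integral over |z| = rho of f(z)/z^(n+1) dz.

    Since dz = i z d(theta), this equals the mean of f(z)/z^n on the circle.
    """
    total = 0j
    for k in range(samples):
        z = rho * cmath.exp(2j * math.pi * k / samples)
        total += f(z) / z ** n
    return total / samples

# For f(z) = e^z the coefficients should be 1/n!
coeffs = [taylor_coefficient(cmath.exp, n) for n in range(6)]
expected = [1.0 / math.factorial(n) for n in range(6)]
```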


Theorem 15. Every (continuously) complex-differentiable function is 
complex-analytic. 


Proof. The sum, product, and quotient of complex-differentiable 
functions are complex-differentiable (where defined). That follows readily 
from the definition of complex-differentiable, by the same arguments one 
uses for differentiation on the real line. 

In this theorem, we are given f, a function of class C^1 on an open 
set in the plane which is complex-differentiable, i.e., which satisfies the 
Cauchy-Riemann condition. We want to show that f is (locally) the sum of 
a convergent power series. We may assume that the domain of f contains 
the closed disk of radius ρ about the origin and content ourselves with 
showing that f is analytic on that disk. Let w be a point with |w| < ρ. 
Define 

\[ g(z) = z\, \frac{f(z) - f(w)}{z - w}, \quad z \ne w; \qquad g(w) = wf'(w). \]

Then g is (obviously) continuously complex-differentiable, except at w; 
and g is continuous at w. We claim that 

\[ g(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(re^{i\theta})\, d\theta, \qquad r \le \rho. \]

To prove this, let I(r) be the integral on the right. Then I(r) is continuous 
on 0 ≤ r ≤ ρ and differentiable except possibly at r = |w|. Since g is 
complex-differentiable except (possibly) at w, 

\[ \frac{\partial g}{\partial r} + \frac{i}{r} \frac{\partial g}{\partial \theta} = 0, \qquad r \ne |w|. \]

Thus, 

\[ \frac{dI}{dr} = 0, \qquad r \ne |w| \]

as in our proof of the mean value property for analytic functions. So I(r) 
is constant. Obviously, 

\[ \lim_{r \to 0} I(r) = g(0). \]

Consequently, we have proved the mean value property for g: 

\[ g(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(\rho e^{i\theta})\, d\theta = \frac{1}{2\pi i} \int_{|z| = \rho} \frac{g(z)}{z}\, dz. \]

This says that 

\[ 0 = g(0) = \frac{1}{2\pi i} \int_{|z| = \rho} \frac{f(z) - f(w)}{z - w}\, dz = \frac{1}{2\pi i} \int_{|z| = \rho} \frac{f(z)}{z - w}\, dz - f(w). \]

So Cauchy’s formula is valid for f. That integral formula shows us 
immediately how to expand f in a power series about the origin. 


We remarked earlier that Theorem 15 is valid without the assumption 
that f’ is continuous, i.e., just the existence of the complex derivative (on 
an open set) implies analyticity. The sort of proof we have given can be 
refined to cover that case; however, the refinement tends to cloud the 
simplicity of the arguments, so we have omitted it. 

Here is another basic corollary of the mean value property (or 
Cauchy’s formula). 


Theorem 16. Let D be an open set in the plane and let {f_n} be a sequence 
of complex-analytic functions on D. If f_n converges uniformly to f, then f is 
complex-analytic. 


Proof. Analyticity is a local property; hence, we need only prove the 
theorem when D is a disk centered at the origin. The Cauchy formula 
states that 

\[ f_n(w) = \frac{1}{2\pi i} \int_{|z| = r} \frac{f_n(z)}{z - w}\, dz = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f_n(re^{i\theta})\, re^{i\theta}}{re^{i\theta} - w}\, d\theta \]

if |w| < r and r is less than the radius of D. Fix an r and a w. Since f_n(re^{iθ}) 
converges to f(re^{iθ}), uniformly in θ, and since f_n(w) converges to f(w), we 
have 

\[ f(w) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f(re^{i\theta})\, re^{i\theta}}{re^{i\theta} - w}\, d\theta = \frac{1}{2\pi i} \int_{|z| = r} \frac{f(z)}{z - w}\, dz. \]

Therefore, f is complex-analytic on the disk |z| < r: 

\[ f(w) = \sum_{n=0}^{\infty} a_n w^n \]

where 

\[ a_n = \frac{1}{2\pi i} \int_{|z| = r} \frac{f(z)}{z^{n+1}}\, dz. \]

(See the proof of the previous corollary.) 


Corollary. Let f, g be complex-analytic functions. Then (where defined) 
the sum f + g, the product fg, the quotient f/g, and the composition f ∘ g 
are complex-analytic functions. 


Proof. This result follows rather easily from Theorem 15 and the 
corresponding result about sums, products, etc. for complex-differentiable 
functions. It is informative to think of it this way. Theorem 16 tells us 
that a function is complex-analytic if and only if it is locally the uniform 
limit of a sequence of polynomials p_n(z). Since sums, products, and 
compositions of polynomials are polynomials, it follows that f + g, fg, and 
f ∘ g are analytic if f and g are. The function 1/z is analytic for z ≠ 0: 

\[ \frac{1}{z} = \frac{1}{z_0 + (z - z_0)} = \frac{1}{z_0} \cdot \frac{1}{1 + z_0^{-1}(z - z_0)} = \sum_{n=0}^{\infty} (-1)^n z_0^{-(n+1)} (z - z_0)^n, \qquad |z - z_0| < |z_0|. \]

Thus (by composing with 1/z) f/g is analytic off the zero set of g. 


It follows that the class of real-analytic functions is closed under sum, 
product, quotient, and composition (where defined). That is because the 
power series for such functions about x_0 can be used to extend them to 
complex-analytic functions in a neighborhood of x_0 in the complex plane. 
But note that the theorem on uniform convergence breaks down 
completely. It is a famous theorem (which we shall soon prove) that any 
continuous function on a closed interval can be uniformly approximated by 
real-analytic functions. Thus, for power series, there is an enormous 
difference between uniform convergence on an interval and uniform convergence 
on a disk. 

Let us summarize briefly what we know about complex-analytic 
functions. We have proved that, for a complex-valued function f on an open 
set D in the plane, the following conditions are equivalent: 

1. f is complex-analytic; i.e., at each point z_0 ∈ D there is a power 
series in (z − z_0) which converges to f(z) in a neighborhood of z_0. 

2. f is complex-differentiable. 

3. If f = u + iv, where u, v are real-valued, the map (x, y) → (u, v) 
from D into R^2 is of class C^1 and satisfies the Cauchy-Riemann equations: 

\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \]

4. If the closed disk |z − z_0| ≤ ρ is contained in D, then 

\[ f(w) = \frac{1}{2\pi i} \int_{|z - z_0| = \rho} \frac{f(z)}{z - w}\, dz, \qquad |w - z_0| < \rho. \]

That is, 

\[ f(w) = \sum_{n=0}^{\infty} a_n (w - z_0)^n, \qquad |w - z_0| < \rho \]

where 

\[ a_n = \frac{1}{2\pi i} \int_{|z - z_0| = \rho} \frac{f(z)}{(z - z_0)^{n+1}}\, dz. \]

5. If z_0 is any point of D, there is a sequence of polynomials p_n(z) 
which converges to f(z), uniformly on some neighborhood of z_0. 


We only verified that condition 2 implies analyticity under the 
assumption that f’ is continuous. 

We also showed that each complex-analytic function has the mean 
value property on circles. It is not in the list above because it does not 
characterize complex-analytic functions. (In fact, it characterizes harmonic 
functions.) There is, however, a stronger form of the mean value property 
which does characterize analytic functions. If the disk |z − z_0| ≤ ρ is 
contained in the domain of the analytic function f, then g(z) = (z − z_0)f(z) 
is complex-analytic on |z − z_0| ≤ ρ and satisfies g(z_0) = 0. By the Cauchy 
formula 

\[ 0 = g(z_0) = \frac{1}{2\pi i} \int_{|z - z_0| = \rho} \frac{g(z)}{z - z_0}\, dz = \frac{1}{2\pi i} \int_{|z - z_0| = \rho} f(z)\, dz. \]

In other words the line integral of the differential 1-form f(z) dz over any 
circle γ = {z; |z − z_0| = ρ} contained in D is 0: 

\[ \int_{\gamma} f(z)\, dz = 0. \]


We can add to the list of characterizations 1-5: 


6. If γ is the circular boundary of any closed disk in the domain D, 
then 

\[ \int_{\gamma} f(z)\, dz = 0. \]

Why does condition 6 imply analyticity? We shall deal with this in 
the case where f is known to be of class C^1. We borrow the result from the 
calculus of functions of two variables known as Green’s theorem: If P 
and Q are real or complex-valued functions of class C^1 on (a neighborhood 
of) the closed disk Δ, and if γ is the (circular) boundary of Δ, oriented 
counterclockwise, then 

\[ \int_{\gamma} (P\, dx + Q\, dy) = \iint_{\Delta} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\, dy. \]

Apply this with P dx + Q dy = f dz = f dx + if dy and one has 

\[ \int_{\gamma} f(z)\, dz = \iint_{\Delta} \left( i \frac{\partial f}{\partial x} - \frac{\partial f}{\partial y} \right) dx\, dy. \]

Thus condition 6 says that the continuous function i(∂f/∂x) − (∂f/∂y) 
has the property that 

\[ \iint_{\Delta} \left( i \frac{\partial f}{\partial x} - \frac{\partial f}{\partial y} \right) dx\, dy = 0 \]

for every closed disk Δ in the domain D. This implies that 

\[ i \frac{\partial f}{\partial x} = \frac{\partial f}{\partial y}, \]

which is the Cauchy-Riemann condition for complex-differentiability. 
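
Condition 6 separates analytic from non-analytic functions quite visibly in a numerical experiment. The sketch below is illustrative only: `circle_integral`, the center, the radius, and the two test functions are our assumptions. It integrates f(z) dz around a circle for an analytic function and for the conjugate function, which violates the Cauchy-Riemann condition; for the latter the integral over a circle of radius ρ works out to 2πiρ^2.

```python
import cmath
import math

def circle_integral(f, z0, rho, n=2000):
    """Approximate the integral of f(z) dz over the circle |z - z0| = rho,
    oriented counterclockwise."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        z = z0 + rho * e
        dz = 1j * rho * e * (2.0 * math.pi / n)
        total += f(z) * dz
    return total

center, radius = 0.2 + 0.1j, 0.8

# An analytic integrand: the integral should vanish
analytic_case = circle_integral(lambda z: z * z * cmath.exp(z), center, radius)

# The conjugate function is not analytic: the integral is 2 pi i rho^2
conjugate_case = circle_integral(lambda z: z.conjugate(), center, radius)
```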

At this point we should remark that conditions 4 and 6 are valid more 
generally. If f is analytic and γ is any simple, closed rectifiable path in D, 
then 

(Cauchy’s formula)    \[ f(w) = \frac{1}{2\pi i} \int_{\gamma} \frac{f(z)}{z - w}\, dz \qquad (w \text{ inside } \gamma) \]

(Cauchy integral theorem)    \[ \int_{\gamma} f(z)\, dz = 0. \]

Loosely speaking, a simple, closed rectifiable path is one which winds 
around each point inside it exactly once. It would take us too far afield to 
discuss these matters precisely at this point. 

Exercises 


1. If f is a complex-analytic function on a disk D, then f = g’, where g is 
complex-analytic on D. 


2. Let f be a function which is complex-analytic and has no zeros on the disk 
D. Show that f = e^g, where g is complex-analytic on D. Hint: What differential 
equation must g satisfy? 


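3. If f is complex-analytic and bounded in the unit disk |z| < 1, then 

\[ \sum_{n=0}^{\infty} \left| \frac{f^{(n)}(0)}{n!} \right|^2 < \infty. \]

Hint: What is 

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(re^{i\theta})|^2\, d\theta\, ? \]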


4. If f is complex-analytic on |z| ≤ ρ, then the complex derivatives of f are 
given by 

\[ f^{(n)}(w) = \frac{n!}{2\pi i} \int_{|z| = \rho} \frac{f(z)}{(z - w)^{n+1}}\, dz, \qquad |w| < \rho. \]


5. (Liouville’s theorem) If f is bounded and complex-analytic on the entire 
plane, then f is constant. Hint: Use Exercise 4 to look at f’(w). Think big 
(circles). 


6. Let f be a function which is complex-analytic on the punctured disk 0 < |z| 
< r. If f is bounded, show that f can be extended to a complex-analytic function 
on the disk |z| < r. Hint: First show that z^2 f(z) is analytic in |z| < r. 


7. Use Exercises 5 and 6 to show that, if f is complex-analytic on the entire 
plane and if there is a constant K such that 

\[ |f(z)| \le K |z|^n, \qquad \text{for large } |z| \]

then f is a polynomial of degree not more than n. 


8. Let {f_n} be a sequence of complex-analytic functions on D. If f_n converges 
uniformly to f, then {f_n’} converges to f’. 


9. Let f be complex-analytic on the annulus r_1 < |z| < r_2. Prove that f = g 
+ h, where g is complex-analytic on the disk |z| < r_2 and h is complex-analytic 
outside the disk |z| ≤ r_1. Hint: What integral would you think would define g? 


*10. Let 
QHD". zl seal. 


n=0 


If |α| = 1, show that f cannot be extended so as to be analytic in a neighborhood 
of α. 


*11. Let 


n2? 


a= 25 [Iz] <1. 

Show that f is continuous on |z| ≤ 1, analytic on |z| < 1, but f cannot be 
extended analytically to any point on the unit circle |z| = 1. 


5.7. Complex-Analytic Maps 


One of the most important aspects of the theory of analytic functions 
is concerned with how an analytic mapping 

\[ D \xrightarrow{\ f\ } C \]

behaves geometrically. We do not have the time nor the tools to go into 
that question in depth; however, a few basic facts are within easy reach. 


Theorem 17 (Weak Maximum Principle). Let f be complex-analytic on 
an open set D, and let V be an open subset of D such that the closure \bar{V} 
of V is compact. There exists a point w on the boundary of V such that 

\[ |f(w)| = \sup_{V} |f(z)|. \]

Proof. Since f is continuous, there exists a point w ∈ \bar{V} such that 

\[ |f(w)| = \max_{\bar{V}} |f(z)| = \sup_{V} |f(z)|. \]

Among such points w, choose one which is nearest to the boundary of V. 
(We can do that because the boundary of V is closed.) This w cannot be in 
V, for the following reason. 

If w ∈ V, choose r > 0 so that the closed disk of radius r about w is 
contained in V. Then f(w) is the mean value of f over C_r, the circle of 
radius r about w: 

\[ f(w) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(w + re^{i\theta})\, d\theta. \]

Therefore, 

\[ |f(w)| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(w + re^{i\theta})|\, d\theta \le \sup_{C_r} |f| \cdot \frac{1}{2\pi} \int_{-\pi}^{\pi} d\theta = \sup_{C_r} |f| \le \sup_{V} |f|. \]

If |f(w)| = sup_V |f|, then equality must hold at every step in the chain of 
inequalities just derived. That can only happen if f(z) is constantly equal 
to f(w) on the circle C_r, because |f(w)| − |f(z)| is a non-negative 
continuous function with integral 0. Some point of C_r is nearer to the 
boundary of V than w is; hence sup_V |f| is attained at some point closer to the 
boundary of V than w is. This contradicts the choice of w. 

Corollary (Strong Maximum Principle). Let f be a non-constant 
complex-analytic function on a connected open set V. Then 

\[ |f(w)| < \sup_{V} |f(z)|, \qquad w \in V; \]

in other words, |f| cannot attain a maximum at any point in V. 


Proof. It suffices to prove the result when \bar{V} is compact. (Why?) From 
the proof of Theorem 17, we see that, if w ∈ V and 

\[ |f(w)| = \sup_{V} |f(z)| \]

then f is constant on a neighborhood of w. Since V is connected, that 
cannot happen unless f is constant. 


With a little information about the behavior of the modulus of an 
analytic function, we can prove one of the most basic theorems in mathe- 
matics. 

Theorem 18 (Fundamental Theorem of Algebra). Let 

\[ p(z) = a_0 + a_1 z + \cdots + a_n z^n \]

be a non-constant polynomial with complex coefficients. There exists a 
complex number z such that p(z) = 0. 


Proof. We may assume that n > 1 because the case n = 1 is trivial. 
We may also assume that a_n = 1: 

\[ p(z) = a_0 + a_1 z + \cdots + a_{n-1} z^{n-1} + z^n. \]

We claim that |p(z)| is big outside big sets, i.e., that 

\[ \lim_{|z| \to \infty} |p(z)| = \infty. \]

We have 

\[ |p(z)| = \Bigl| z^n + \sum_{k=0}^{n-1} a_k z^k \Bigr| \ge |z|^n - \sum_{k=0}^{n-1} |a_k|\, |z|^k. \]

If |z| ≥ 1, then |z|^k ≤ |z|^{n−1}, k = 0, ..., n − 1; hence 

\[ |p(z)| \ge |z|^n - c\, |z|^{n-1}, \qquad |z| \ge 1 \]

where c = |a_0| + ··· + |a_{n−1}|. Thus 

\[ |p(z)| \ge |z|^{n-1}, \qquad |z| \ge 1 + c. \]

The last inequality shows that |p(z)| → ∞. In particular, we can 
choose r > 0 so that 

(5.40)    \[ |p(z)| \ge 1 + |a_0|, \qquad |z| \ge r. \]

Any zero of p must lie in the disk D_r = {|z| < r}. Does p have a zero on 
D_r? There exists a point w in the closed disk \bar{D}_r = {|z| ≤ r} such that 

(5.41)    \[ |p(w)| = \inf \{ |p(z)| ;\ z \in \bar{D}_r \} \le |a_0| \]

because p is continuous and \bar{D}_r is compact. By (5.40) and (5.41) the point 
w is in the open disk D_r. If p(w) ≠ 0, then 1/p is analytic on D_r and |1/p| has 
a local maximum at w. By the maximum principle, 1/p (and hence p) is 
constant. We cannot have that; hence p(w) = 0. 
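
The growth estimate in the proof, |p(z)| ≥ |z|^{n−1} for |z| ≥ 1 + c with c = |a_0| + ··· + |a_{n−1}|, can be spot-checked on sample circles. The sketch below is an illustration with assumed names (`modulus_lower_bound_holds`) and an arbitrarily chosen monic cubic; it samples finitely many points and so is a plausibility check, not a proof.

```python
import cmath
import math

def modulus_lower_bound_holds(coeffs, radius, samples=360):
    """Check |p(z)| >= |z|^(n-1) at sample points of |z| = radius, where
    p(z) = z^n + a_{n-1} z^{n-1} + ... + a_0 and coeffs = [a_0, ..., a_{n-1}]."""
    n = len(coeffs)
    for k in range(samples):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        p = z ** n + sum(a * z ** j for j, a in enumerate(coeffs))
        # small relative slack for floating-point rounding
        if abs(p) < abs(z) ** (n - 1) * (1 - 1e-12):
            return False
    return True

coeffs = [2 - 1j, -3.0, 0.5j]        # p(z) = z^3 + 0.5i z^2 - 3z + (2 - i)
c = sum(abs(a) for a in coeffs)      # the constant c from the proof
checks = [modulus_lower_bound_holds(coeffs, r)
          for r in (1 + c, 2 * (1 + c), 10 * (1 + c))]
```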


Corollary. If p is a polynomial with complex coefficients 

\[ p(z) = a_0 + a_1 z + \cdots + a_n z^n, \qquad a_n \ne 0, \]

there exist complex numbers z_1, ..., z_n such that 

\[ p(z) = a_n (z - z_1)(z - z_2) \cdots (z - z_n). \]


Proof. This results from repeated application of the theorem plus the 
fact that if p(w) = 0, then p(z) = (z − w)q(z), where q is a polynomial. 


We now establish a result about the zeros of the general analytic 
function. It is an integral formula for counting the zeros. When we count 
zeros, we shall always include multiplicities. That means the following. 
If f is analytic near z_0 and if f ≠ 0, we can write 

\[ f(z) = (z - z_0)^n g(z) \]

where g is analytic near z_0 and g(z_0) ≠ 0. We call n either the order of the 
zero at z_0 or the multiplicity of z_0 as a zero of f. We then count z_0 as n zeros 
of f. 


Theorem 19. Let f be a function which is complex-analytic on the closed 
disk |z − z_0| ≤ ρ with no zeros on |z − z_0| = ρ. Then 

\[ \frac{1}{2\pi i} \int_{|z - z_0| = \rho} \frac{f'(z)}{f(z)}\, dz \]

is equal to the number of zeros which f has in the disk |z − z_0| < ρ. 


Proof. We may assume that z_0 = 0. If f has no zeros on |z| ≤ ρ, then 
f’/f is analytic on that disk; consequently (by the mean value property), 

\[ \frac{1}{2\pi i} \int_{|z| = \rho} \frac{f'(z)}{f(z)}\, dz = \left[ z\, \frac{f'(z)}{f(z)} \right]_{z = 0} = 0. \]

So, the formula is correct when f is zero-free. 
Now, take a general f. The identity theorem (Theorem 12) tells us that 
f, being analytic on an open set which contains |z| ≤ ρ and not 
identically zero, can vanish at only a finite number of points in that disk. Let 
z_1, ..., z_n be the distinct points z such that |z| < ρ and f(z) = 0. Let k_j 
be the order of the zero at z_j. Then 

\[ f(z) = (z - z_1)^{k_1} \cdots (z - z_n)^{k_n} g(z) \]

where g is analytic and zero-free on |z| ≤ ρ. Compute that 

\[ \frac{f'(z)}{f(z)} = \frac{g'(z)}{g(z)} + \sum_{j=1}^{n} \frac{k_j}{z - z_j}. \]

By the result from the zero-free case 

\[ \frac{1}{2\pi i} \int_{|z| = \rho} \frac{f'(z)}{f(z)}\, dz = \frac{1}{2\pi i} \int_{|z| = \rho} \frac{g'(z)}{g(z)}\, dz + \sum_{j=1}^{n} k_j\, \frac{1}{2\pi i} \int_{|z| = \rho} \frac{dz}{z - z_j} = \sum_{j=1}^{n} k_j. \]

Lemma. Let f be complex-analytic and non-constant on a neighborhood 
of the point z_0. Let n be the order of the zero of f − f(z_0) at z_0. For 
every sufficiently small number ρ > 0, there exists ε > 0 such that, if 0 < 
|α − f(z_0)| < ε, then there are precisely n points z with |z − z_0| < ρ and 
f(z) = α. 

Proof. Let 

\[ D_\rho = \{ z ;\ |z - z_0| < \rho \}, \qquad C_\rho = \{ z ;\ |z - z_0| = \rho \}, \qquad \rho > 0. \]

We choose ρ sufficiently small that 

(i) f is complex-analytic on D_ρ; 

(ii) f(z) ≠ f(z_0), if z ∈ D_ρ and z ≠ z_0; 

(iii) f’(z) ≠ 0, if z ∈ D_ρ and z ≠ z_0. 

The identity theorem (Theorem 12) tells us that, if (ii) were not satisfied for 
small ρ, then f would be constant near z_0. Similarly, if (iii) were not satisfied 
we would have f’ = 0 near z_0 and hence f constant. Note that f’(z_0) may 
or may not be 0. 

Condition (ii) and Theorem 19 tell us that 

\[ \frac{1}{2\pi i} \int_{C_\rho} \frac{f'(z)}{f(z) - f(z_0)}\, dz = n \]

where n is the order of the zero which f − f(z_0) has at the point z_0. 
Similarly, if α ∉ f(C_ρ), then 

\[ N(\alpha) = \frac{1}{2\pi i} \int_{C_\rho} \frac{f'(z)}{f(z) - \alpha}\, dz \]

is the number of zeros of f − α inside C_ρ, i.e., the number of times f 
assumes the value α on the disk D_ρ. Since N(α) is an integer, we will have 
N(α) = n provided that |N(α) − n| < 1. Look at the integrals which give 
us n and N(α). It should be clear that those integrals are less than 1 apart 
for all α such that |α − f(z_0)| < ε, i.e., for all α sufficiently near f(z_0). 

FIGURE 20 

We now have a picture like that in Figure 20 for small ρ, ε. Of course, 
Δ_ε = {α; |α − f(z_0)| < ε}. If α ∈ Δ_ε, then f(z) = α for exactly n 
points z ∈ D_ρ. Presumably, that counts multiplicities. For instance, if 
α = f(z_0), then f(z) = α for n points of D_ρ only because of the 
convention to count z_0 as an n times repeated zero of f(z) − f(z_0). But, for 
points α ≠ f(z_0) multiplicities do not enter because f^{-1}(α) consists of n 
distinct points. Why? If f(w) = α, then f’(w) ≠ 0 by condition (iii) on 
ρ; so, f − α has a first order zero at w. 


The conclusion of the lemma can be roughly summarized this way. 
If f is analytic and non-constant at z_0, then near z_0 

\[ f(z) = f(z_0) + a_n (z - z_0)^n + a_{n+1} (z - z_0)^{n+1} + \cdots \]

where a_n ≠ 0, and f behaves there much the way that z^n behaves near the 
origin. Two basic facts are easily derived from the lemma. 


Theorem 20 (Open Mapping Theorem). Let f be a non-constant 
complex-analytic function on a connected open set D. Then f is an open mapping; 
that is, if V is an open subset of D, then f(V) is an open set. 


Proof. Let z_0 be a point in V, an open subset of D. We must show that 
f(V) is a neighborhood of f(z_0). Since D is connected and f is not constant, 
f − f(z_0) has a zero at z_0 of some well-defined order n, 1 ≤ n. The last 
lemma shows us that, for small ρ, the image under f of D_ρ = {z; |z − z_0| 
< ρ} contains the disk about f(z_0) of some radius ε > 0. Since D_ρ ⊂ V 
for small ρ, the result follows. 


Theorem 21. Let f be a complex-analytic map 

\[ D \xrightarrow{\ f\ } C \]

on an open set D in the plane. 
(i) If z_0 ∈ D, then f is 1:1 on a neighborhood of z_0 if and only if 
f’(z_0) ≠ 0. 
(ii) If f is 1:1 on D, then f^{-1}: 

\[ D \xleftarrow{\ f^{-1}\ } f(D) \]

is complex-analytic. 

Proof. Statement (i) is a special case of the lemma prior to Theorem 
20. To prove (ii), observe that f is an open mapping. That tells us that 
f(D) is an open set. It also tells us that f^{-1} is continuous. Since f is 1:1, 
f’(z) ≠ 0 throughout D. Now we can see that f^{-1} is 
complex-differentiable at each point w = f(z) in f(D): 

\[ \lim_{\alpha \to w} \frac{f^{-1}(\alpha) - f^{-1}(w)}{\alpha - w} = \lim_{\alpha \to w} \frac{f^{-1}(\alpha) - z}{\alpha - f(z)} = \frac{1}{f'(z)}. \]

We also see that (f^{-1})’ is continuous: 

\[ (f^{-1})'(w) = \frac{1}{f'(f^{-1}(w))}. \]

So f^{-1} is complex-analytic, by Theorem 15. 



EXAMPLE 11. Let f be the exponential function, f(z) = e^z. Then f is 
analytic on the entire plane. Since f’ = f, the derivative has no zeros. Thus 
f is locally 1:1. The function f is not 1:1 on the plane, because 

\[ f(z + 2n\pi i) = f(z). \]

Evidently, the equation 

\[ w = e^z = e^x e^{iy} \]

can be solved for z if w ≠ 0; hence, the image of f is the punctured plane 
{w ∈ C; w ≠ 0}. In fact, any horizontal strip 

\[ -\infty < x < \infty, \qquad y_0 \le y < y_0 + 2\pi \]

is mapped 1:1 onto the punctured plane. Fix w_0 ≠ 0 and select any z_0 
such that w_0 = e^{z_0}. Theorem 21 tells us (as is obvious in this case) that 
there is a neighborhood U of z_0 which is mapped by f in a 1:1 manner onto 
V, a neighborhood of w_0. Also, the inverse map on V is analytic. What 
does all this say? It says, start with any w_0 ≠ 0. Given any logarithm z_0 
of w_0, we can analytically extract logarithms on a neighborhood of w_0, in 
such a way that log w_0 = z_0. 
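
One concrete way to realize the local logarithm of Example 11 on a small neighborhood of w_0 is to write log w = z_0 + Log(w/w_0), where Log is the principal logarithm, which is analytic near 1. The sketch below uses this construction; the helper name `local_log` and the sample points are assumptions for illustration, and the construction is one possible choice rather than the only one.

```python
import cmath

def local_log(w0, z0):
    """Return a branch L of the logarithm near w0 with L(w0) = z0,
    assuming exp(z0) == w0."""
    def branch(w):
        # w / w0 is near 1, where the principal logarithm is analytic
        return z0 + cmath.log(w / w0)
    return branch

w0 = -2.0 + 0j
z0 = cmath.log(w0) + 4j * cmath.pi   # one of the many logarithms of w0
L = local_log(w0, z0)

w = -2.1 + 0.3j                      # a point near w0
value_at_w0 = L(w0)                  # equals z0
round_trip = cmath.exp(L(w))         # equals w
```

Note that w_0 here lies on the negative real axis, where the principal logarithm itself is discontinuous; the local branch is nevertheless continuous near w_0.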


Exercises 


1. (Schwarz lemma) Suppose that f is complex-analytic and bounded by M 
on the unit disk: 

\[ |f(z)| \le M, \qquad |z| < 1. \]

If f(0) = 0, then |f(z)| ≤ M|z| on the unit disk. 


2. Let f be a complex-analytic function on the unit disk such that 

\[ |f(z)| \le 1, \quad |z| < 1; \qquad f'(0) = 1. \]

Prove that f(z) = z. Hint: Express f’(0) as an integral over |z| = r. 


3. If |α| < 1, let 

\[ L(z) = \frac{z - \alpha}{1 - \alpha^* z}, \qquad |z| \le 1. \]

Show that L is a 1:1 complex-analytic map of the unit disk onto the unit disk. 


4. Let f be complex-analytic on the unit disk. Let |α| < 1 and suppose that 
f(α) = 0. Then 

\[ f(z) = \frac{z - \alpha}{1 - \alpha^* z}\, g(z) \]

where g is complex-analytic on the unit disk and 

\[ \sup_{|z| < 1} |f(z)| = \sup_{|z| < 1} |g(z)|. \]


5. Suppose that f is analytic on the unit disk and |f(z)| ≤ M. If f(α) = 0, 
then 

\[ |f(z)| \le M \left| \frac{z - \alpha}{1 - \alpha^* z} \right|, \qquad |z| < 1. \]

If equality holds at any point z, then f is a constant multiple of the function L 
from Exercise 3. 


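6. Let f be a complex-analytic map from the unit disk into the unit disk. Prove 
that 

\[ \left| \frac{f(\beta) - f(\alpha)}{1 - f(\alpha)^* f(\beta)} \right| \le \left| \frac{\beta - \alpha}{1 - \alpha^* \beta} \right|. \]

Hint: Use Exercise 5 and the function 

\[ g(z) = \frac{f(z) - f(\alpha)}{1 - f(\alpha)^* f(z)}. \]
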
7. Let f be a 1:1 analytic map of the unit disk onto the unit disk. Show that 

\[ f(z) = c\, \frac{z - \alpha}{1 - \alpha^* z} \]

where |c| = 1 and |α| < 1. 


8. Let {f_n} be a sequence of complex-analytic functions on D which converges 
uniformly to f. If f has a zero somewhere on D, there is an N such that f_n has a 
zero on D for all n ≥ N. 


9. Is the result in Exercise 8 valid if we change “has a zero” to “has no zero” in 
the two places where it occurs? 


10. Let p(z) be a polynomial of degree n ≥ 1. Show that 

\[ \lim_{\rho \to \infty} \int_{|z| = \rho} \left[ \frac{p'(z)}{p(z)} - \frac{n}{z} \right] dz = 0. \]

In view of Theorem 19 what does that tell you about the number of zeros which 
p has inside large disks? 

11. Show that 


f@ = 13 


maps the unit disk 1:1 onto the right half-plane Re z > 0. What is the image 
under f of the upper half of the unit disk? 


12. Find explicitly a 1:1 analytic map of half of the unit disk onto the full unit 
disk. Hint: Compose maps as in Figure 21. 


FIGURE 21 


13. Prove the open mapping theorem directly from the strong maximum 
principle and the fact that the reciprocal of a non-vanishing analytic function is 
analytic. 


*14. (Rouché’s theorem) Let f and g be complex-analytic on the closed disk 
|z| ≤ ρ and suppose that |g(z)| < |f(z)| on the circle |z| = ρ. Prove that f 
and f + g have the same number of zeros on the disk |z| < ρ. 


5.8. Fourier Series 


On an interval of the real line, the attempt to expand functions in 
power series fails even for some very smooth functions. Of course, many 
of the functions with which we work regularly are real-analytic and their 
series expansions are important in a variety of problems. But, the class of 
real-analytic functions is extremely special, and that limits the applicability 
of power series. Now, we are going to talk a bit about an equally important 
type of series expansion—the expansion into a trigonometric (or Fourier) 
series. Such expansions can be obtained for every smooth function on an 
interval. Furthermore, expansions into trigonometric series are natural 
and important in many mathematical problems (and in many applications 
of mathematics). 

Suppose that f is a (complex-valued) function on the interval [−π, π]. 
We ask if it is possible to expand f in a series 

(5.42)    \[ f(x) = \sum_{n=-\infty}^{\infty} c_n e^{inx}. \]

Such an expansion would be equivalent to 

\[ f(x) = c_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) \]

where 

\[ a_n = c_n + c_{-n}, \qquad b_n = i(c_n - c_{-n}). \]

We shall continue to work with the exponential functions. 

In the presence of absolute convergence, there can be no ambiguity 
about the meaning of a Fourier expansion (5.42); however, to allow for 
more general situations, we should stipulate the order in which the terms 
are to be summed. As one might guess, what we have in mind is to form 
the partial sums this way: 

\[ s_n(x) = \sum_{k=-n}^{n} c_k e^{ikx}. \]

The meaning of (5.42) is then that f(x) = lim s_n(x). 


If any Fourier expansion is to be valid, it is quite easy to see which 
coefficients c_n we should use. If, for instance, (5.42) were uniformly 
convergent, we could legitimately compute 

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ikx} f(x)\, dx = \sum_{n=-\infty}^{\infty} c_n\, \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(n-k)x}\, dx = c_k \]

because 

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(n-k)x}\, dx = \begin{cases} 0, & n \ne k \\ 1, & n = k. \end{cases} \]

Therefore, if we start with a Riemann-integrable function f, we define the 
Fourier coefficients of f to be 

(5.43)    \[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\, dx, \qquad n = 0, \pm 1, \pm 2, \ldots \]

and we call the formal series 

(5.44)    \[ \sum_{n=-\infty}^{\infty} c_n e^{inx} \]

the Fourier series for f. Then we ask: 


(i) Does the Fourier series converge? 
(ii) If it converges, does it converge to f? 


We are going to limit the discussion to piecewise-continuous functions 
although the results are valid more generally. 
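
To make definition (5.43) and the symmetric partial sums concrete, the following sketch computes Fourier coefficients of f(x) = |x| on [−π, π] by a midpoint rule and evaluates a partial sum. The function names, the sample count, and the choice of f are assumptions made for illustration; for this f a direct calculation gives c_0 = π/2 and c_n = ((−1)^n − 1)/(πn^2) for n ≠ 0.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=4000):
    """Midpoint-rule approximation of
    c_n = (1/2 pi) * integral over [-pi, pi] of f(x) e^{-inx} dx."""
    h = 2.0 * math.pi / samples
    total = 0j
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2.0 * math.pi)

def partial_sum(coeffs, x):
    """Symmetric partial sum: the sum of c_k e^{ikx} over the keys k of coeffs."""
    return sum(c * cmath.exp(1j * k * x) for k, c in coeffs.items())

# Coefficients c_{-25}, ..., c_{25} of f(x) = |x|
coeffs = {k: fourier_coefficient(abs, k) for k in range(-25, 26)}
c0 = coeffs[0]                          # close to pi / 2
c1 = coeffs[1]                          # close to -2 / pi
approx_at_one = partial_sum(coeffs, 1.0)  # close to |1| = 1
```

Here f(−π) = f(π), so the periodic extension is continuous and the partial sums approach f quite quickly.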

There are some elementary observations we should make. We are 
attempting to expand f in a series of functions each periodic with period 
2π. Hence, if we want to have the series converge to f at each point, we had 
better assume that f(−π) = f(π). Of course, we would not expect the 
Fourier series for a piecewise-continuous function to converge to f(x) at 
each point, since the values of f at finitely many points could be changed 
without altering the Fourier coefficients c_n. Thus, question (ii) above should 
be reworded along the lines of: If the series converges, does it converge to 
f at most points? 

Before we discuss the two basic questions (i), (ii) for the general 
continuous (or piecewise-continuous) function, let’s talk about smooth 
functions f. For a function f of class C^1, which satisfies f(−π) = f(π) it is 
not too difficult to see that the Fourier series converges uniformly to f. 
In order to prove that, we first show that the Fourier coefficients tend to 0. 


Theorem 22 (Riemann-Lebesgue). Let f be a piecewise-continuous function
on the interval [a, b]. Then

\[ \lim_{|t|\to\infty} \int_a^b f(x)e^{itx}\,dx = 0. \]

Proof. The function f is essentially the sum of a finite number of
functions, each of which is uniformly continuous on an interval and 0
outside that interval. So, it should be clear that we need only prove the
theorem for continuous functions f. Any continuous function f can be
uniformly approximated by step functions, that is, functions which are
piecewise constant. Why? Let $\epsilon > 0$. By the uniform continuity of f,
there exists $\delta > 0$ such that

\[ |f(x_1) - f(x_2)| < \epsilon, \qquad |x_1 - x_2| < \delta. \]

Choose a partition $a = x_0 < x_1 < \cdots < x_n = b$ of [a, b] such that
$x_k - x_{k-1} < \delta$, k = 1, ..., n. Define

\[ s(x) = f(x_{k-1}), \qquad x_{k-1} \le x < x_k. \]

(See Figure 22.) Then s is piecewise-constant and is uniformly within $\epsilon$ of f:

\[ |f(x) - s(x)| < \epsilon, \qquad a \le x < b. \]


FIGURE 22

Therefore, it suffices to prove the theorem for step functions. We have left
the proof of that sufficiency to the exercises. 

But now the proof is easy, because each step function is a finite sum
of functions which are constant on an interval and 0 outside that interval.
Thus, all we need to do is to prove the result for constant functions, i.e.,
to show that

\[ \lim_{|t|\to\infty} \int_a^b e^{itx}\,dx = 0. \]

That is easy to see, since the integral can be calculated explicitly:

\[ \int_a^b e^{itx}\,dx = \frac{1}{it}\left(e^{itb} - e^{ita}\right),
   \qquad \left|\int_a^b e^{itx}\,dx\right| \le \frac{2}{|t|}. \]


Corollary. If f is piecewise-continuous, then

\[ \lim_{|t|\to\infty} \int_a^b f(x)\cos tx\,dx
   = \lim_{|t|\to\infty} \int_a^b f(x)\sin tx\,dx = 0. \]


The Riemann-Lebesgue theorem is valid for many more functions
than piecewise-continuous ones. It is valid for Riemann-integrable functions
as well as ones which are “integrable” in the sense of Lebesgue
(Chapter 7). What interests us now is that the Riemann-Lebesgue theorem
shows that the question of whether or not the Fourier series for f converges
at the point x depends only upon the behavior of f near x.


Lemma. Let f be a piecewise-continuous function on $[-\pi, \pi]$. The nth
partial sum of the Fourier series for f is

(5.45)  \[ s_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x-t)\,\frac{\sin(n+\tfrac{1}{2})t}{\sin\tfrac{1}{2}t}\,dt. \]

Proof. We have

\[ s_n(x) = \sum_{k=-n}^{n} c_k e^{ikx}
          = \sum_{k=-n}^{n} e^{ikx}\,\frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)e^{-ikt}\,dt
          = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)D_n(x-t)\,dt \]

where

\[ D_n(x) = \sum_{k=-n}^{n} e^{ikx}. \]

Extend f to a piecewise-continuous function on the real line in such a way
that $f(x + 2\pi) = f(x)$. (Forget about the values at $\pi$ and $-\pi$. They don’t
affect the integrals we’re considering.) The substitution $u = x - t$ yields

\[ s_n(x) = \frac{1}{2\pi}\int_{x-\pi}^{x+\pi} f(x-u)D_n(u)\,du \]

and since both f and $D_n$ are periodic with period $2\pi$

\[ s_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x-u)D_n(u)\,du. \]


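The rest of the lemma is a calculation:

\[ D_n(x) = \sum_{k=0}^{n} e^{ikx} + \sum_{k=-n}^{-1} e^{ikx}
          = \frac{e^{i(n+1)x} - 1}{e^{ix} - 1} + \frac{e^{-inx} - 1}{1 - e^{ix}}
          = \frac{\cos nx - \cos(n+1)x}{1 - \cos x}
          = \frac{\sin(n+\tfrac{1}{2})x}{\sin\tfrac{1}{2}x}. \]

The sequence $\{D_n\}$ is called Dirichlet’s kernel.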
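
The closed form for Dirichlet’s kernel is easy to spot-check numerically. The following is only a sketch, not part of the proof: it compares the defining sum of the exponentials $e^{ikx}$, $-n \le k \le n$, with the quotient $\sin(n+\tfrac{1}{2})x / \sin\tfrac{1}{2}x$ at a few sample points. The function names and the sample points are illustrative assumptions.

```python
import cmath
import math


def dirichlet_sum(n, x):
    """D_n(x) computed directly as the sum of e^{ikx} for k = -n, ..., n."""
    return sum(cmath.exp(1j * k * x) for k in range(-n, n + 1)).real


def dirichlet_closed(n, x):
    """D_n(x) from the closed form; valid when x is not a multiple of 2*pi."""
    return math.sin((n + 0.5) * x) / math.sin(0.5 * x)


# Largest disagreement between the two expressions over a small grid.
max_gap = max(
    abs(dirichlet_sum(n, x) - dirichlet_closed(n, x))
    for n in (1, 5, 20)
    for x in (0.1, 1.0, 2.5, -3.0)
)
print(max_gap)
```

The two expressions should agree up to rounding error; at multiples of $2\pi$ the quotient has to be read as its limiting value $2n + 1$.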


Theorem 23. Let f be a piecewise-continuous function of period $2\pi$ on
the real line. At any point x where f is differentiable, the Fourier series for
f converges to f(x).

Proof. The result is stated for periodic functions on the real line in
order to be clear about what is meant here by differentiability at the end
points of $[-\pi, \pi]$. If f is a function on $[-\pi, \pi]$ and if we extend f periodically
by $f(x + 2\pi) = f(x)$, then the extended function is differentiable at $\pi$
exactly when $f(\pi) = f(-\pi)$ and $f'(\pi) = f'(-\pi)$. This assumes more than
differentiability of the corresponding function on the interval $[-\pi, \pi]$.

Fix a point x where f is differentiable. We have

(5.46)  \[ s_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x-t)D_n(t)\,dt \]

where

\[ D_n(t) = \frac{\sin(n+\tfrac{1}{2})t}{\sin\tfrac{1}{2}t}. \]

If we apply (5.46) to the constant function 1, we have

(5.47)  \[ 1 = \frac{1}{2\pi}\int_{-\pi}^{\pi} D_n(t)\,dt. \]

Multiply (5.47) by the constant f(x) and subtract from (5.46):

\[ s_n(x) - f(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} [f(x-t) - f(x)]\,D_n(t)\,dt
               = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{f(x-t) - f(x)}{t}\; tD_n(t)\,dt. \]


Since f is differentiable at x, the difference quotient

(5.48)  \[ \frac{f(x-t) - f(x)}{t} \]

is piecewise-continuous for $-\pi \le t \le \pi$ (with a suitable definition of its
value at 0). Now

\[ tD_n(t) = \frac{t}{\sin\tfrac{1}{2}t}\,\sin(n+\tfrac{1}{2})t. \]

The first factor on the right is piecewise-continuous because it has the
limit 2 at t = 0. Thus,

\[ s_n(x) - f(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} h(t)\sin(n+\tfrac{1}{2})t\,dt \]

where h is piecewise-continuous. By the (Corollary of the) Riemann-Lebesgue
theorem,

\[ \lim_{n\to\infty}\,[s_n(x) - f(x)] = 0. \]
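
To see Theorem 23 at work on a concrete function, here is a small numerical sketch. It assumes the example f(x) = x on $(-\pi, \pi)$, extended with period $2\pi$; this example is not taken from the text. Integrating by parts in (5.43) gives $c_0 = 0$ and $c_k = i(-1)^k/k$ for $k \neq 0$, so that $s_n(x) = 2\sum_{k=1}^{n} (-1)^{k+1}\sin(kx)/k$. The extended function is differentiable at x = 1, so the partial sums there should approach 1.

```python
import math


def partial_sum(n, x):
    """n-th partial sum of the Fourier series of f(x) = x on (-pi, pi).

    Uses s_n(x) = 2 * sum_{k=1}^{n} (-1)^(k+1) * sin(kx) / k, which follows
    from c_k = i(-1)^k / k for k != 0 and c_0 = 0.
    """
    return 2.0 * sum((-1) ** (k + 1) * math.sin(k * x) / k for k in range(1, n + 1))


x = 1.0  # a point where the periodic extension of f is differentiable
errors = [abs(partial_sum(n, x) - x) for n in (10, 100, 1000, 4000)]
print(errors)
```

The errors shrink only slowly as n grows, which is typical when the periodic extension has a jump elsewhere (here at $\pm\pi$).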


It is not too difficult to strengthen the proof of the last theorem to
show that, on any closed interval where f is of class $C^1$, the partial sums $s_n$
converge uniformly to f. In Exercises 10-13, we have indicated one way to
do that. We shall obtain the result later as a simple consequence of a more
general convergence theorem.

The hypothesis in Theorem 23 can be weakened somewhat, e.g., if f
is continuous at x and of bounded variation near x, then $s_n(x)$ converges
to f(x); however, the smoothness condition on f cannot be weakened too
much. If f is merely continuous at x, the Fourier series need not converge
at x. That is disappointing but not a disaster. One of the oldest problems in
the study of Fourier series was solved only a few years ago when the
Swedish mathematician Lennart Carleson proved that the Fourier series
for a continuous function (or even an integrable function) converges to the
function “almost everywhere”. The proof is not too far beyond us conceptually,
but it is much too long and difficult for us to present.

Prior to Carleson’s result, and presumably this will continue to be the
case, the attitude had been this: If the Fourier series of a continuous function
need not converge to the function at all points, then let us look for
other ways to recapture the function, given its Fourier series. One
method of doing that is the following, known as the Cesaro summability
method. Form the averages

(5.49)  \[ \sigma_n = \frac{1}{n}\,(s_0 + \cdots + s_{n-1}). \]

These functions are called the Cesaro means of the Fourier series for f. It
is a theorem that, if f is continuous and $f(-\pi) = f(\pi)$, then the Cesaro
means $\sigma_n$ converge uniformly to f. We shall not stop to prove this, because
in the next section we shall prove the analogous result for a summation
method known as Abel-Poisson summation. It is a much more illuminating
method.


Exercises 


1. What is the Fourier series for 
F(x) = sin x cos 2x? 
(Think. Don’t compute any integrals.) 


2. Let f be a function of class $C^2$ on $[-\pi, \pi]$ with $f(-\pi) = f(\pi)$. Use
integration by parts to show that

\[ \sum_{n=-\infty}^{\infty} |c_n| < \infty. \]


3. If f is of class $C^2$ and $f(-\pi) = f(\pi)$, then the Fourier series for f converges
uniformly to f.


4. True or false? If f is of class $C^1$ and $f(-\pi) = f(\pi)$, the Fourier series for f′
is obtained from term-by-term differentiation of the Fourier series for f.


5. If f is an even function, f(−x) = f(x), then its Fourier coefficients satisfy
$c_{-n} = c_n$ and so the Fourier series is really a “cosine series”. What about odd
functions, f(−x) = −f(x)?

6. Let $\{f_n\}$ be a sequence of piecewise-continuous functions, for each of which
the Riemann-Lebesgue theorem is known to be valid. If $f_n$ converges uniformly
to f, show that the Riemann-Lebesgue theorem is valid for f.


7. Show that in Exercise 6 it would be enough to know that

\[ \lim_{n\to\infty} \int_a^b |f_n - f| = 0. \]


8. Use the result of Exercise 7 to (define the objects in and) prove the Riemann-
Lebesgue theorem for functions which are piecewise-continuous in this sense:
There is a partition of $[-\pi, \pi]$ such that f is continuous and bounded on each
open interval defined by the partition.


9. Use Exercise 8 to show that, if f is piecewise-continuous on $[-\pi, \pi]$ and
satisfies a Lipschitz condition at x:

\[ |f(x) - f(y)| \le M\,|x - y|, \qquad \text{all } y, \]

then $s_n(x)$ converges to f(x).

10. If f is a function on the real line and if $t \in R$, the t-translate of f is the
function $f_t$ defined by

\[ f_t(x) = f(x + t). \]

If f is piecewise-continuous, show that translation of f is continuous in this sense:

\[ \lim_{t\to 0} \int_a^b |f(x) - f_t(x)|\,dx = 0. \]

Hint: It follows from uniform continuity in case f is continuous; and, if f is
piecewise-continuous, there is a continuous g such that $\int |f - g| < \epsilon$.


11. Let f be piecewise-continuous and let $0 < \delta < \pi$. Show that

\[ \lim_{n\to\infty} \int_{\delta}^{\pi} f(t)D_n(t)\,dt = 0. \]

12. Let f be piecewise-continuous on the real line and let $0 < \delta < \pi$. Show that

\[ \lim_{n\to\infty} \int_{\delta}^{\pi} f(x - t)D_n(t)\,dt = 0 \]

uniformly for $x \in [-\pi, \pi]$. Hint: Use Exercises 10 and 11.


13. If f is of class $C^1$, show that the Fourier series for f converges uniformly to f.
Suggestion: You need to show that

\[ \lim_{n\to\infty} \int_{-\pi}^{\pi} \frac{f(x-t) - f(x)}{\sin\tfrac{1}{2}t}\,\sin(n+\tfrac{1}{2})t\,dt = 0 \]

uniformly in x. Choose $\delta$, $0 < \delta < \pi$, so that the difference quotient is within $\epsilon$
of f′(x) for $|t| < \delta$ (and for all x). Break up the integral into the integral over
$[-\delta, \delta]$ plus what’s left over. Apply Exercise 12 to the left-overs.


*14. The series

\[ \sum_{n \neq 0} \frac{1}{n}\,e^{inx} \]

is the Fourier series of a piecewise-continuous function. Find that function.


*15. Prove that the series 


co 
A nx 
n=l Nn 


is not the Fourier series of a piecewise-continuous function. 


5.9. Abel-Poisson Summation 


Suppose that f is a complex-valued function on the interval $[-\pi, \pi]$
and $f(-\pi) = f(\pi)$. Then f may be regarded as a function on the unit
circle in the complex plane:

(5.50)  \[ f(e^{i\theta}) = f(\theta), \qquad -\pi \le \theta \le \pi. \]

If f is [piecewise] continuous on $[-\pi, \pi]$, then the corresponding function
on the circle is [piecewise] continuous. There is nothing much to that;
however, there is an important point here. When we think of f as a function
on the unit circle, it suggests some things to do with f that otherwise we
might have missed. In particular, it suggests a strong connection between
Fourier series and complex power series. That connection will lead us to
the Abel-Poisson summation method for Fourier series.

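Let’s begin with a very special situation. Suppose that f is a complex-
analytic function in a disk of radius greater than 1 about the origin:

(5.51)  \[ f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad |z| < 1 + \epsilon. \]

Now

\[ f(re^{i\theta}) = \sum_{n=0}^{\infty} a_n r^n e^{in\theta}, \qquad 0 \le r < 1 + \epsilon. \]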


In particular, the restriction of f to the unit circle is a continuous function 
on that circle, and the Fourier series for that function is staring us in the 
face: 


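(5.52)  \[ f(e^{i\theta}) = \sum_{n=0}^{\infty} a_n e^{in\theta}. \]

In fact, for each r we have a function $f_r$ on the unit circle

(5.53)  \[ f_r(e^{i\theta}) = f(re^{i\theta}) \]

and the Fourier coefficients for $f_r$ are

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-in\theta} f_r(e^{i\theta})\,d\theta =
   \begin{cases} a_n r^n, & n \ge 0 \\ 0, & n < 0. \end{cases} \]

Since f is continuous on $|z| < 1 + \epsilon$,

\[ \lim_{r\to 1} f_r = f \]

and the convergence is uniform on the unit circle.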

Here is the idea of Abel-Poisson summability. Start with any piecewise-
continuous complex-valued function f on the unit circle. Let $c_n$ be the
nth Fourier coefficient of f. Define

(5.54)  \[ f_r(e^{i\theta}) = \sum_{n=-\infty}^{\infty} c_n r^{|n|} e^{in\theta}, \qquad 0 \le r < 1. \]


Why does this make sense? The sequence $\{c_n\}$ is bounded:

\[ |c_n| \le \sup\,\{|f(z)|;\ |z| = 1\}. \]

Clearly then, if r < 1, the series in (5.54) is uniformly absolutely convergent
because

\[ \sum_{n=0}^{\infty} r^n = (1 - r)^{-1}, \qquad 0 \le r < 1. \]


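Now, we’re going to prove that (as $r \to 1$) $f_r$ converges to f (at most
points). First, we obtain an integral representation for $f_r$:

\[ f_r(e^{i\theta}) = \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}\,\frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{it})e^{-int}\,dt
                 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{it}) \sum_{n=-\infty}^{\infty} r^{|n|} e^{in(\theta - t)}\,dt. \]

Thus we have

(5.55)  \[ f_r(e^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{it})P_r(\theta - t)\,dt \]

where

(5.56)  \[ P_r(\theta) = \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}
                   = \sum_{n=0}^{\infty} r^n e^{in\theta} + \sum_{n=1}^{\infty} r^n e^{-in\theta}
                   = \frac{1}{1 - re^{i\theta}} + \frac{re^{-i\theta}}{1 - re^{-i\theta}}
                   = \frac{1 - r^2}{1 - 2r\cos\theta + r^2}. \]
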
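The family of functions $\{P_r\}$ is known as the Poisson kernel. That
family of functions has these properties:

(i) $\frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\theta)\,d\theta = 1$;
(ii) $P_r \ge 0$; $P_r(-\theta) = P_r(\theta)$;
(iii) if $0 < \delta < \pi$, then

\[ \lim_{r\to 1}\, \sup\,\{P_r(\theta);\ \delta \le |\theta| \le \pi\} = 0. \]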


FIGURE 23 


Property (ii) is obvious. No calculation is necessary in order to verify
property (i). It results from (5.55) applied to the constant function f = 1.
What does (iii) say? It says, fix a small neighborhood of 0. On that part of
$[-\pi, \pi]$ which is outside that neighborhood, $P_r$ goes uniformly to 0 as r
tends to 1. On the other hand, $P_r(0)$ tends to $\infty$ as r tends to 1, and so the
picture is something like that indicated in Figure 23.
In order to carefully verify (iii), observe that

\[ P_r(\theta) \le \frac{1 - r^2}{1 - 2r\cos\delta + r^2}, \qquad \delta \le |\theta| \le \pi. \]
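
The three properties of the Poisson kernel can also be observed numerically. The sketch below is an illustration only; the function names, the truncation length, and the sample values of r and $\delta$ are assumptions. It compares the closed form with a truncated series, approximates the normalized integral in (i) by an equally spaced Riemann sum, and evaluates the bound used for (iii) at $\theta = \delta$ for r approaching 1.

```python
import math


def poisson_closed(r, theta):
    """P_r(theta) from the closed form (1 - r^2) / (1 - 2 r cos(theta) + r^2)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(theta) + r * r)


def poisson_series(r, theta, terms=400):
    """Truncation of sum r^{|n|} e^{i n theta} = 1 + 2 * sum_{n>=1} r^n cos(n theta)."""
    return 1.0 + 2.0 * sum(r ** n * math.cos(n * theta) for n in range(1, terms + 1))


def mean_over_circle(g, samples=4000):
    """(1/2pi) * integral of g over [-pi, pi], by an equally spaced Riemann sum."""
    h = 2.0 * math.pi / samples
    return sum(g(-math.pi + j * h) for j in range(samples)) / samples


r = 0.9
series_gap = abs(poisson_series(r, 0.7) - poisson_closed(r, 0.7))
total_mass = mean_over_circle(lambda t: poisson_closed(r, t))  # property (i)

# Property (iii): the bound (1 - r^2) / (1 - 2 r cos(delta) + r^2) for r near 1.
delta = 0.5
sup_outside = [poisson_closed(rr, delta) for rr in (0.9, 0.99, 0.999)]
print(series_gap, total_mass, sup_outside)
```

The last list should decrease toward 0 as r approaches 1, while the value at $\theta = 0$, namely (1 + r)/(1 − r), grows without bound.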


Theorem 24. Let f be a piecewise-continuous function on the unit circle
with Fourier coefficients $\{c_n\}$. At each point $e^{i\theta}$ where f is continuous, the
Abel-Poisson means

\[ f_r(e^{i\theta}) = \sum_{n=-\infty}^{\infty} c_n r^{|n|} e^{in\theta}
                 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{it})P_r(\theta - t)\,dt \]

converge to $f(e^{i\theta})$ as r tends to 1. The convergence is uniform on each closed
arc on which f is continuous.


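Proof. It is technically convenient to (again) regard f as a function on
the real line with period $2\pi$:

\[ f(\theta) = f(e^{i\theta}). \]

Then

\[ f_r(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)P_r(\theta - t)\,dt
             = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta - t)P_r(t)\,dt. \]

Thus

\[ f(\theta) - f_r(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} [f(\theta) - f(\theta - t)]\,P_r(t)\,dt \]

\[ |f(\theta) - f_r(\theta)| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(\theta) - f(\theta - t)|\,P_r(t)\,dt. \]

(We just used property (i) and property (ii) of $P_r$.)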

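Let $\epsilon > 0$. If f is continuous at $\theta$, we can choose $\delta > 0$ such that

(5.57)  \[ |f(\theta) - f(t)| < \frac{\epsilon}{2}, \qquad |\theta - t| < \delta. \]

Then

\[ |f(\theta) - f_r(\theta)|
   \le \frac{1}{2\pi}\int_{|t|<\delta} |f(\theta) - f(\theta - t)|\,P_r(t)\,dt
     + \frac{1}{2\pi}\int_{|t|\ge\delta} |f(\theta) - f(\theta - t)|\,P_r(t)\,dt \]

\[ < \frac{\epsilon}{2} + \frac{1}{2\pi}\int_{|t|\ge\delta} |f(\theta) - f(\theta - t)|\,P_r(t)\,dt \]

\[ \le \frac{\epsilon}{2} + K\,\sup\,\{P_r(t);\ \delta \le |t| \le \pi\} \]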


where

\[ K = 2\,\sup\,\{|f(t)|;\ -\pi \le t \le \pi\}. \]

From property (iii) of $P_r$ we have

\[ |f(\theta) - f_r(\theta)| < \epsilon, \qquad r \text{ near } 1. \]

If f is continuous on the arc $a \le \theta \le b$, then uniform continuity allows
us to choose $\delta$ so that (5.57) holds uniformly in $\theta$. Our inequalities
then show that $f_r$ converges uniformly to f on that arc.


Corollary. If f is continuous on the unit circle and

\[ \int_{-\pi}^{\pi} f(e^{i\theta})e^{-in\theta}\,d\theta = 0, \qquad n = 0, \pm 1, \pm 2, \ldots, \]

then f = 0.


Corollary. If f is continuous on the unit circle, then f can be uniformly
approximated by trigonometric polynomials

\[ p(e^{i\theta}) = \sum_{n=-N}^{N} a_n e^{in\theta}. \]

Proof. Choose r so that $f_r$ is uniformly near f. Then take a partial sum
of the series for $f_r$:

\[ p(e^{i\theta}) = \sum_{n=-N}^{N} c_n r^{|n|} e^{in\theta}. \]


Corollary (The Weierstrass Approximation Theorem). Let f be a continuous
complex-valued function on a closed interval [a, b] of the real line.
Then f can be uniformly approximated by polynomials.


Proof. Evidently we may assume that the interval is $[a, \pi]$, where
$-\pi < a < \pi$. Extend f to a continuous function on $[-\pi, \pi]$ such that
$f(-\pi) = f(\pi)$. (Just use a straight line.) If $\epsilon > 0$, there is a trigonometric
polynomial

\[ g(x) = \sum_{n=-N}^{N} c_n e^{inx} \]

with $\|f - g\|_\infty < \epsilon$. Each $e^{inx}$ can be expanded in a power series in x.
Thus, there is a polynomial

\[ p_n(x) = \sum_{k} a_{nk}\,x^k \]

with complex coefficients so that $|e^{inx} - p_n(x)|$ is uniformly small on
$[-\pi, \pi]$. The polynomial

\[ p(x) = \sum_{n=-N}^{N} c_n\,p_n(x) \]

will be uniformly close to f.

Theorem 25. Let f be a piecewise-continuous function on the interval
$[-\pi, \pi]$. Let $c_n$ be the nth Fourier coefficient of f and let $s_n$ be the nth partial
sum of the Fourier series for f. Then

(i) $\sum_{n=-\infty}^{\infty} |c_n|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x)|^2\,dx$;

(ii) $\lim_{n\to\infty} \int_{-\pi}^{\pi} |f(x) - s_n(x)|^2\,dx = 0$.

Proof. Let r < 1. From the uniform convergence of the series

\[ f_r(e^{i\theta}) = \sum_{n=-\infty}^{\infty} c_n r^{|n|} e^{in\theta} \]

it is easy to compute that

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} |f_r(e^{i\theta})|^2\,d\theta
   = \frac{1}{2\pi}\int_{-\pi}^{\pi} f_r(e^{i\theta})\,f_r(e^{i\theta})^*\,d\theta
   = \sum_{n=-\infty}^{\infty} |c_n|^2\,r^{2|n|}. \]

Now, let r tend to 1. If f is continuous, then $f_r$ converges uniformly to f and
so we obtain statement (i) for f. But that argument works equally well for a
function f which is continuous on $a \le \theta \le b$ and is 0 off that interval.
Since a piecewise-continuous function is the sum of a finite number of such
functions (disregarding the values at a finite number of points), we have
the relation (i) for each piecewise-continuous f.

To prove (ii), just calculate that

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x) - s_n(x)|^2\,dx
   = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x)|^2\,dx - \sum_{k=-n}^{n} |c_k|^2. \]

Then apply (i).
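
A quick numerical illustration of Theorem 25, under an assumed example that is not in the text: for f(x) = x on $(-\pi, \pi)$ one finds $c_0 = 0$ and $|c_k|^2 = 1/k^2$ for $k \neq 0$, while $\frac{1}{2\pi}\int_{-\pi}^{\pi} x^2\,dx = \pi^2/3$. By the identity used to prove (ii), the gap between $\pi^2/3$ and the partial coefficient sums is exactly the mean-square error of $s_n$. The names below are illustrative.

```python
import math


def coefficient_energy(n):
    """Sum of |c_k|^2 over |k| <= n for f(x) = x, where |c_k|^2 = 1/k^2 and c_0 = 0."""
    return 2.0 * sum(1.0 / (k * k) for k in range(1, n + 1))


# (1/2pi) * integral of x^2 over [-pi, pi]
mean_square = math.pi ** 2 / 3.0

# Each gap equals (1/2pi) * integral of |f - s_n|^2, by the identity in the proof of (ii).
gaps = [mean_square - coefficient_energy(n) for n in (10, 100, 1000, 10000)]
print(gaps)
```

The gaps are positive and shrink roughly like 2/n, consistent with statements (i) and (ii).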


Corollary. If f is a function of class $C^1$ on the unit circle, the Fourier
series for f converges uniformly to f.


Proof. We have

\[ f(x) = f(0) + \int_0^x f'(t)\,dt. \]

Then integration by parts shows that the nth Fourier coefficient of f′ is

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} f'(x)e^{-inx}\,dx
   = \frac{in}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-inx}\,dx
   = inc_n. \]

Therefore, the nth partial sum of the Fourier series for f′ is $s_n'$, the derivative
of the nth partial sum of the Fourier series for f. So

\[ f(x) - s_n(x) = f(0) - s_n(0) + \int_0^x [f'(t) - s_n'(t)]\,dt \]

\[ |f(x) - s_n(x)| \le |f(0) - s_n(0)| + \int_{-\pi}^{\pi} |f'(t) - s_n'(t)|\,dt. \]

We know that $s_n(0)$ converges to f(0), by Theorem 23. We also know that

\[ \lim_{n\to\infty} \int_{-\pi}^{\pi} |f'(t) - s_n'(t)|^2\,dt = 0. \]

The result now follows from the Schwarz inequality:

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} |g(t)|\,dt \le \left(\frac{1}{2\pi}\int_{-\pi}^{\pi} |g(t)|^2\,dt\right)^{1/2}. \]

We have left that inequality for the exercises.


Exercises 


1. Let f be a piecewise-continuous function on the unit circle. Suppose that all
of the Fourier coefficients of f are non-negative (real numbers). Prove that the
Fourier series for f converges uniformly to f.


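2. For piecewise-continuous f and g, define

\[ \langle f, g \rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)g(x)^*\,dx. \]

(a) Show that $\langle \cdot, \cdot \rangle$ is a pseudo inner product on the space of piecewise-
continuous functions. That means that it satisfies the conditions

(i) $\langle cf + g, h \rangle = c\langle f, h \rangle + \langle g, h \rangle$;
(ii) $\langle g, f \rangle = \langle f, g \rangle^*$;
(iii) $\langle f, f \rangle \ge 0$.

(b) Show that the Cauchy-Schwarz inequality holds:

\[ |\langle f, g \rangle|^2 \le \langle f, f \rangle\,\langle g, g \rangle. \]

(Consult the proof for Euclidean space.)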


3. Use Theorem 25 to show that, if f has Fourier coefficients $a_n$ and g has
Fourier coefficients $b_n$, then

\[ \langle f, g \rangle = \sum_{n=-\infty}^{\infty} a_n b_n^*. \]

(See Exercise 2 for the notation $\langle f, g \rangle$.)


4. If f and g are piecewise-continuous functions on the unit circle, define the
convolution f * g by

\[ (f * g)(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - t)g(t)\,dt. \]

Show that

(a) f * g = g * f;

(b) f * (g * h) = (f * g) * h;

(c) if f has Fourier coefficients $a_n$ and g has Fourier coefficients $b_n$, then
f * g has Fourier coefficients $a_n b_n$.


5. If f and g are piecewise-continuous functions, then the Fourier series for the
convolution f * g converges uniformly to f * g. (Hint: Use Theorem 25 and
Exercise 4(c).)

6. Refer to the convolution multiplication defined in Exercise 4. If there were
an identity element for that multiplication (e * f = f for all f), what would the
Fourier series for e have to be? (Use Exercise 4(c).) Is that the Fourier series of a
function? (See Exercise 1 or Theorem 25.)


7. If f is a function on the real line, you know what the t-translate of f is:

\[ f_t(x) = f(x + t). \]

What is the relation between the Fourier coefficients of f and those of the
function $f_t$?


8. If f has Fourier coefficients $c_n$, translate the coefficients by k:

\[ a_n = c_{k+n}. \]

Which function has $\{a_n\}$ for its sequence of Fourier coefficients?


9. Use the result of Exercise 8 to show the following. If some (sort of a) function
f had the Fourier series

\[ \sum_{n=-\infty}^{\infty} e^{inx} \]

then that function would have to satisfy

(i) f(x) = 0, $x \neq 0$;

(ii) $\frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx = 1$.

Dirac called that function the “delta function”. Obviously no such function exists
in the sense of our definition of function. But, find a function of bounded variation
F such that

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{inx}\,dF(x) = 1, \qquad \text{all } n. \]


10. Look at Figure 23. Does it look like the Poisson kernel $P_r$ converges (as r
goes to 1) to the mythical delta function of Exercise 9? What is the convolution
of f with the non-existent delta function? In view of those things, does it seem
plausible that $f * P_r$ converges to f as r tends to 1?


11. Let f be a function which is continuous on the closed disk $|z| \le 1$ and is
complex-analytic on the open disk |z| < 1. Then f defines a function on the unit
circle |z| = 1 and that function satisfies

\[ \int_{-\pi}^{\pi} f(e^{i\theta})e^{in\theta}\,d\theta = 0, \qquad n = 1, 2, 3, \ldots. \]

Prove that, if we are given a continuous function on the circle which satisfies that
condition, it can be extended to a function which is continuous on the disk
$|z| \le 1$ and analytic on |z| < 1.


12. Which continuous functions on the closed unit disk $|z| \le 1$ can be uniformly
approximated by polynomials p(z)? Compare with the Weierstrass approximation
theorem.

13. We mentioned the Cesaro means of the Fourier series for f:

\[ \sigma_n = \frac{1}{n}\,(s_0 + s_1 + \cdots + s_{n-1}), \qquad n = 1, 2, 3, \ldots. \]

Show that

\[ \sigma_n(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)K_n(x - t)\,dt \]

where $K_n$ is the nth Cesaro mean of the formal series

\[ \sum_{n=-\infty}^{\infty} e^{inx}. \]

Verify that the sequence $\{K_n\}$ satisfies the conditions analogous to properties (i),
(ii), and (iii) of the Poisson kernel (positive, integral 1, etc.). From those properties
show that the Cesaro means for a continuous f on the circle converge
uniformly to f.


The Dirichlet Problem 


Recall that a (complex-valued) function u on an open set in the plane
is called harmonic if it is of class $C^2$ and satisfies Laplace’s equation:

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0. \]

We came across such functions in our work with complex-analyticity. We
did not say much about them, except to note that an analytic function is
harmonic and hence, that its real and imaginary parts are harmonic as well.
Harmonic functions are important in their own right, for a variety of 
reasons. Thus, we shall devote this little section to a few of their fundamen- 
tal properties. The discussion will also help to round out our brief treat- 
ment of power series and Fourier series. 


Theorem 26 (Mean Value Property). Let u be a harmonic function on
an open set D in the plane. If the closed disk $\{z;\ |z - z_0| \le \rho\}$ is contained in
D, then

\[ u(z_0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} u(z_0 + \rho e^{i\theta})\,d\theta. \]


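Proof. As usual, we assume that $z_0 = 0$, for convenience. Since u is of
class $C^2$, it has continuous radial and angular derivatives of the second
order (except at the origin). A simple calculation shows that Laplace’s
equation in terms of such derivatives is:

(5.58)  \[ r\,\frac{\partial}{\partial r}\left(r\,\frac{\partial u}{\partial r}\right) + \frac{\partial^2 u}{\partial\theta^2} = 0. \]

Define

\[ I(r) = r\int_{-\pi}^{\pi} \frac{\partial u}{\partial r}\,d\theta, \qquad 0 < r < \rho. \]

By Laplace’s equation (5.58),

\[ \frac{dI}{dr} = \int_{-\pi}^{\pi} \frac{\partial}{\partial r}\left(r\,\frac{\partial u}{\partial r}\right)d\theta
                = -\frac{1}{r}\int_{-\pi}^{\pi} \frac{\partial^2 u}{\partial\theta^2}\,d\theta = 0. \]

Thus I(r) is constant, and since obviously

\[ \lim_{r\to 0} I(r) = 0 \]

we have I(r) = 0. In other words,

(5.59)  \[ \int_{-\pi}^{\pi} \frac{\partial u}{\partial r}\,d\theta = 0, \qquad 0 < r < \rho. \]

Now use that fact to compute that

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} u(\rho e^{i\theta})\,d\theta
   = u(0) + \frac{1}{2\pi}\int_{-\pi}^{\pi}\left[\int_0^{\rho} \frac{\partial u}{\partial r}\,dr\right]d\theta
   = u(0) + \frac{1}{2\pi}\int_0^{\rho}\left[\int_{-\pi}^{\pi} \frac{\partial u}{\partial r}\,d\theta\right]dr
   = u(0). \]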
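
The mean value property lends itself to a numerical check. The following sketch is an illustration only; the sample function, the center, and the radii are assumptions. It averages the harmonic function $u(x, y) = e^x\cos y$ (the real part of $e^z$) over circles about a point and compares the result with the value at the center. For contrast it does the same for $x^2 + y^2$, which is not harmonic and whose average over a circle of radius $\rho$ exceeds the central value by $\rho^2$.

```python
import math


def u(x, y):
    """A harmonic function on the plane: the real part of e^z."""
    return math.exp(x) * math.cos(y)


def circle_average(func, x0, y0, rho, samples=2000):
    """(1/2pi) * integral of func(z0 + rho e^{i theta}) d theta, by a Riemann sum."""
    h = 2.0 * math.pi / samples
    total = 0.0
    for j in range(samples):
        theta = -math.pi + j * h
        total += func(x0 + rho * math.cos(theta), y0 + rho * math.sin(theta))
    return total / samples


x0, y0 = 0.3, -0.2
gaps = [abs(circle_average(u, x0, y0, rho) - u(x0, y0)) for rho in (0.1, 1.0, 2.5)]

# Contrast: x^2 + y^2 is not harmonic, and the average over radius 1 is off by 1.
non_harmonic_gap = circle_average(lambda x, y: x * x + y * y, x0, y0, 1.0) - (x0 * x0 + y0 * y0)
print(gaps, non_harmonic_gap)
```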


Corollary (Weak Maximum Principle). Let V be a bounded open set in
the plane. Let u be a function which is continuous on the closure $\bar{V}$ and
harmonic on V. The maximum of |u(z)| over $\bar{V}$ occurs on the boundary of V.
In case u is real-valued, both the maximum and the minimum of u(z) over $\bar{V}$
are attained on the boundary of V.


Proof. This follows from the mean value property, just as it did for 
analytic functions. 


A particular consequence of the maximum principle is that a harmonic 
function on V which vanishes on the boundary of V is 0. Thus, each har- 
monic function is determined by its values on the boundary. The Dirichlet 
problem for an open set V is this. Suppose that we are given a (continuous) 
function on the boundary of V. Can we find a harmonic function on V 
which takes on those values on the boundary? The question should be 
made a little bit more precise. It turns out to be somewhat complicated for
general open sets V; but, when V is a disk, we can handle it nicely. In fact, 
we did that in the last section without saying so. Let us restate the result in 
the present context. 


Theorem 27. Let f be a continuous function on the unit circle. There
exists a function F which is continuous on the closed unit disk, harmonic on
the open unit disk, and which coincides with f on the unit circle. That function
F is unique, and it is given by the Poisson integral formula:

\[ F(re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{it})P_r(\theta - t)\,dt. \]


Proof. As we just remarked, if there is such a function F, there is only
one. Define F on the open disk by the Poisson formula and define $F(e^{i\theta}) =
f(e^{i\theta})$. According to Theorem 24,

\[ \lim_{r\to 1} F(re^{i\theta}) = f(e^{i\theta}) \]

uniformly in $\theta$. From that it is clear that F is continuous on the closed unit
disk.

Why is F harmonic on the open unit disk? There are several ways to
see that. One is to compute that F satisfies Laplace’s equation by differentiating
under the integral sign. The harmonicity derives from the fact that
the function $P(r, \theta) = P_r(\theta)$ is harmonic. But, let’s observe that F is harmonic
by this method. We have

\[ P_r(\theta - t) = \mathrm{Re}\left[\frac{e^{it} + re^{i\theta}}{e^{it} - re^{i\theta}}\right]. \]

Suppose that f is real-valued. Then

\[ F(z) = \mathrm{Re}\; h(z) \]

where

\[ h(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{it})\,\frac{e^{it} + z}{e^{it} - z}\,dt. \]

Obviously h is complex-analytic on |z| < 1. So F is the real part of an
analytic function and is therefore harmonic.
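
As a concrete instance of the Poisson integral formula, here is a hedged numerical sketch; the boundary data, the sample point, and the function names are all assumptions made for illustration. With boundary values $f(e^{it}) = \cos 2t$, the harmonic extension should be $\mathrm{Re}(z^2) = r^2\cos 2\theta$, and a Riemann sum for the Poisson integral should reproduce that value at an interior point.

```python
import math


def poisson_kernel(r, theta):
    """P_r(theta) = (1 - r^2) / (1 - 2 r cos(theta) + r^2)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(theta) + r * r)


def poisson_integral(f, r, theta, samples=2000):
    """F(r e^{i theta}) = (1/2pi) * integral of f(t) P_r(theta - t) dt, by a Riemann sum."""
    h = 2.0 * math.pi / samples
    total = 0.0
    for j in range(samples):
        t = -math.pi + j * h
        total += f(t) * poisson_kernel(r, theta - t)
    return total / samples


def boundary(t):
    """Illustrative boundary data f(e^{it}) = cos 2t."""
    return math.cos(2.0 * t)


r, theta = 0.6, 0.4
computed = poisson_integral(boundary, r, theta)
expected = r ** 2 * math.cos(2.0 * theta)  # Re(z^2) has these boundary values
print(computed, expected)
```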


Corollary. A real-valued function is harmonic if and only if it is locally 
the real part of an analytic function. 


Proof. If u is a real-valued harmonic function on an open set V and if
$z_0 \in V$, let D be any disk which contains $z_0$ and lies in V. By the proof of
the theorem, there is a function h, analytic on D, such that u(z) = Re h(z),
$z \in D$.


Corollary. Let V be a connected open set in the plane and let u be a
harmonic function on V. If u vanishes on a non-empty open subset of V, then
u = 0.


Proof. We may assume that u is real-valued. Let D be the union of all
open sets $U \subset V$ such that u(z) = 0, $z \in U$. Then D is a non-empty open
subset of V. We shall show that D is closed relative to V. Since V is connected,
that will prove the result.

Suppose w is a point in V which is in the closure $\bar{D}$ of D. Choose an open
disk W so that $w \in W \subset V$. From the definition of D and the fact that
$w \in \bar{D}$, it follows that u vanishes on a non-empty open subset of W. But, on W we
have u = Re h where h is analytic. So h is pure imaginary (and therefore
constant) on a non-empty open subset of W. Thus h is an imaginary constant.
So u is zero on W. Conclusion: $w \in D$.


Corollary (Strong Maximum Principle). Let u be a real-valued har- 
monic function on a connected open set V. Then u cannot attain either a 
maximum or a minimum value on V unless u is constant. 


Proof. The mean value property shows that u is constant in a neighborhood
of any point in V where it attains a maximum or minimum. By
the previous corollary, we are done.


Corollary. Let u be a continuous complex-valued function on any open
set V in the plane. Then u is harmonic if and only if u has the mean value
property.

Proof. It is a local theorem; i.e., we need only prove it on a disk. So,
let us say that u is continuous on $|z| \le 1$ and has the mean value property

\[ u(z_0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} u(z_0 + \rho e^{i\theta})\,d\theta \]

on each disk $|z - z_0| \le \rho$ contained in the unit disk. Let v be the harmonic
function on |z| < 1 which (is continuous on $|z| \le 1$ and) agrees with u on
the unit circle. Then, both u and v have the mean value property; consequently,
u − v has that property. Thus the maximum of |u − v| is attained
on the unit circle, where u − v = 0. Conclusion: u = v, so u is
harmonic.


Exercises 


1. If f is complex-analytic and u is harmonic, then the composition $u \circ f$ is
harmonic.


2. True or false? If u is harmonic on the unit disk and if there exists a sequence
$\{z_n\}$ such that $z_n \to 0$ and $u(z_n) = 0$, then u = 0.


3. Show that a function u on the unit disk is harmonic if and only if there exists
a sequence of numbers $c_n$ such that

\[ u(re^{i\theta}) = \sum_{n=-\infty}^{\infty} c_n r^{|n|} e^{in\theta} \]

on the disk.


4. Let u be a harmonic function on the entire plane. Show that u = Re f, where
f is analytic on the plane.


5. (Liouville) Let u be a harmonic function on the plane. If u is bounded above
or bounded below, then u is constant.




6. Let

\[ u(re^{i\theta}) = \log r, \qquad r \neq 0. \]

Show that u is harmonic on the punctured plane. Does there exist f, analytic on
the punctured plane, so that u = Re f?


7. Let p(x, y) be a polynomial in two real variables with real coefficients. Some 
such polynomials are harmonic functions on the plane and some are not. True 
or false? If p(x, y) is harmonic, then p(x, y) = Re f(x + iy) where f(z) is a 
complex polynomial. 


8. (Harnack) Let u be a non-negative harmonic function in the unit disk. Show
that

\[ \frac{1 - r}{1 + r}\,u(0) \le u(re^{i\theta}) \le \frac{1 + r}{1 - r}\,u(0). \]


9. Any uniform limit of harmonic functions is harmonic.


10. (Harnack’s principle) Let $\{u_n\}$ be a sequence of real-valued harmonic functions
on the connected open set D. Suppose that the sequence is increasing:
$u_1 \le u_2 \le \cdots$. Let

\[ u(z) = \lim_{n\to\infty} u_n(z). \]

Then either $u(z) = \infty$ for every $z \in D$ or u is a harmonic function on D. Hint:
See Exercises 8 and 9.


6. Normed Linear Spaces


6.1. Linear Spaces and Norms 


We are going to consider a type of “space” which is more general 
than Euclidean space. There are three reasons for doing so. 


1. The discussion will sharpen our understanding of what we have 
done up to this point. 

2. The terminology will enable us to formulate clearly and naturally 
some results which we have not yet dealt with. 

3. The more general concept will pave the way to a thorough dis- 
cussion of integration in the next chapter. 


Definition. A linear space is a vector space over the field of real num- 
bers or the field of complex numbers. 


Thus, a linear space consists of a set L, together with a vector addi- 
tion on L and a scalar multiplication of vectors by numbers. The vector 
addition is required to satisfy: 


1. Addition is commutative,
X + Y = Y + X.
2. Addition is associative,
(X + Y) + Z = X + (Y + Z).
3. There is a unique vector 0 (zero) in L such that
0 + X = X
for all X in L.
4. To each X in L there corresponds a unique vector −X in L such
that
X + (−X) = 0.

The scalar multiplication is required to satisfy

5. (ab)X = a(bX)
6. 1X = X

and to be related to vector addition by the two distributive laws:

7. c(X + Y) = cX + cY
8. (a + b)X = aX + bX.

In the case of a real linear space, we multiply vectors X by real numbers 
c; and, in the case of a complex linear space, we use complex numbers c. 
When arguments are being presented which are applicable to both cases, 
we shall refer to the field of numbers as the field of scalars. 

If we start with a linear space L, each subset of L which is “closed”
under the linear operations in L provides us with an example of another
linear space: A (linear) subspace of L is a subset M such that


(i) if $X, Y \in M$, then $(X + Y) \in M$;
(ii) if $X \in M$, then $(cX) \in M$ for all scalars c.


Nearly all of the linear spaces we meet will be spaces of functions— 
subspaces of the space of functions on a set. The richness of this class of 
spaces is one indicator of the fact that they form the natural settings for 
many questions in analysis. 


EXAMPLE 1. Let S be any set. The collection of all real- [complex-] 
valued functions on S forms a real [complex] linear space, with the (usual) 
operations 


(f + g)(s) = f(s) + g(s)
(cf)(s) = cf(s).


In the particular case S = R the basic space under discussion is
the space of all real- [complex-] valued functions on the real line. Here is
a decreasing sequence of subspaces: those which consist of the continuous
functions, the differentiable functions, the functions of class $C^1$, ...,
the functions of class $C^k$, ..., the functions of class $C^\infty$, the real-analytic
functions, the polynomials, ..., the polynomials of degree at most
n, ..., the affine functions, the linear functions, the zero function.

In case S is the interval [0, 1], another little decreasing chain of spaces 
of functions consists of the spaces of bounded functions, functions of 
bounded variation, functions which are continuously differentiable, functions
which satisfy the homogeneous differential equation $f'' + f = 0$.

If S consists of the first n positive integers, S = {1, ..., n}, a 
real-valued function on S is an ordered n-tuple X = (x_1, ..., x_n). Therefore 
the space of real-valued functions on S is the basic space R^n. The 
subspaces of R^n were described in Chapter 1. Each has a dimension k, 
0 ≤ k ≤ n, and can be defined by imposing (n − k) homogeneous linear 
conditions on the coordinates x_1, ..., x_n. When n = 3, the subspaces other 
than R^3 and {0} are the planes through the origin and the straight lines 
through the origin. 

If S is the set of positive integers, the space of functions on S is the 
space of sequences of real [complex] numbers, X = (x_1, x_2, x_3, ...). We 
shall mention in subsequent examples some important subspaces of the 
space of sequences. 


Much of analysis deals with problems of approximation and con- 
vergence in linear spaces. Frequently the appropriate measure of approxi- 
mation is provided by the following type of function on vectors, which 
generalizes the idea of length. 


Definition. If L is a linear space, a semi-norm on L is a real-valued 
function || ... || on L with these properties: 


(i) ||X|| ≥ 0; 
(ii) ||X + Y|| ≤ ||X|| + ||Y|| (triangle inequality); 
(iii) ||cX|| = |c| ||X||. 


A semi-norm is called a norm if it also satisfies 
(i)’ if || X|| = 0, then X = 0. 


A [semi-] normed linear space is a linear space L, together with a specified 
[semi-] norm on L. 


EXAMPLE 2. Euclidean space of dimension n is the normed linear 
space which consists of the linear space R^n, together with the norm 


|X| = (x_1^2 + ... + x_n^2)^{1/2}. 

This special norm |...| is referred to as length. There are other norms 
on R^n which are important and useful. Two of these are 

(6.1) ||X||_1 = |x_1| + ... + |x_n| 
      ||X||_∞ = max_i |x_i|. 


Let us mention a more general class of norms on R^n. Let p be a number, 
1 ≤ p < ∞, and define 


(6.2) ||X||_p = (|x_1|^p + ... + |x_n|^p)^{1/p}. 


We shall defer the proof that ||...||_p is a norm, since to see that it satisfies 
the triangle inequality is somewhat more difficult than the verification in 
the special cases p = 2 (length) and p = 1. The notation for the 
maximum norm ||...||_∞ is explained by the result that 


lim_{p→∞} ||X||_p = max_k |x_k| 

which we have left to the exercises. 
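
As a rough numerical illustration of this limit, the following Python sketch evaluates ||X||_p for increasing p and compares the values with the maximum norm. The names p_norm and max_norm and the sample vector are choices made here for illustration; they do not come from the text, and the printed values are evidence, not a proof.

```python
def p_norm(x, p):
    """Return (|x_1|^p + ... + |x_n|^p)^(1/p) for 1 <= p < infinity."""
    if p < 1:
        raise ValueError("p must satisfy p >= 1")
    # Factor out the largest coordinate so that large p does not overflow.
    m = max(abs(t) for t in x)
    if m == 0:
        return 0.0
    return m * sum((abs(t) / m) ** p for t in x) ** (1.0 / p)


def max_norm(x):
    """Return max_k |x_k|, the maximum norm."""
    return max(abs(t) for t in x)


X = [3.0, -4.0, 1.0]  # sample vector, chosen arbitrarily
for p in (1, 2, 10, 100, 1000):
    print(p, p_norm(X, p))
print("max norm:", max_norm(X))
```

For this sample vector the printed values decrease toward the maximum norm as p grows, which is what the limit formula predicts.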


EXAMPLE 3. Complex Euclidean space of dimension n is the complex 
linear space C^n, together with the norm 


|X| = (|x_1|^2 + ... + |x_n|^2)^{1/2}. 

The functions ||...||_p as defined above each provide a norm on C^n. 


EXAMPLE 4. The sup norm on functions entered into our discussion 
of uniform convergence. In that discussion, usually we were dealing with 
continuous functions; however we might as well talk about sup norms in 
the following context. Let S be any non-empty set, and let B(S) be the 
linear space of all bounded complex-valued functions on S. For f in B(S) 
define 


(6.3) ||f||_∞ = sup {|f(s)|; s ∈ S}. 


It is a simple but important matter to verify that B(S), with the sup norm, 
is a normed linear space. 


(i) The norm ||f||_∞ is defined as the supremum (least upper bound) 
of the image of |f|, that is, as the sup of the following set of non-negative 
numbers 

A_f = {|f(s)|; s ∈ S}. 

The point of dealing with bounded functions f is that the set A_f is bounded 
and, accordingly, ||f||_∞ < ∞. Note that A_f is non-empty because S is; 
hence ||f||_∞ ≥ 0. 
(ii) If f and g are bounded functions on S, how do we show that 


||f + g||_∞ ≤ ||f||_∞ + ||g||_∞ ? 


There are various ways to describe why this is so. Perhaps the simplest 
is this. If s ∈ S, then 


|(f + g)(s)| = |f(s) + g(s)| 
             ≤ |f(s)| + |g(s)| 
             ≤ ||f||_∞ + ||g||_∞. 

Thus ||f||_∞ + ||g||_∞ is an upper bound for the image of |f + g| and is 
at least as large as ||f + g||_∞, the least upper bound. 
(iii) The verification that ||cf||_∞ = |c| ||f||_∞ will be left as an exercise. 


So ||f||_∞ is a semi-norm on B(S). Why is it a norm? Well, suppose 
||f||_∞ = 0. Then |f(s)| ≤ 0 at all points s, that is, f(s) = 0 for every s. 

If K is a compact set in R^n, then one subspace of B(K) is C(K), the 
space of continuous complex-valued functions on K. (Remember it is 
a theorem that a continuous function on a compact set is bounded.) This 
provides us with another example of a normed linear space, namely, C(K) 
with the sup norm. 
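
The verification in this example can be tried out on a finite set S, where the supremum is simply a maximum. The sketch below is only an illustration under that assumption; the set S and the functions f, g, h are invented here and are not taken from the text.

```python
def sup_norm(f, S):
    """Sup norm of f over a finite non-empty set S (here the sup is a max)."""
    return max(abs(f(s)) for s in S)


S = range(-5, 6)                 # a finite, non-empty set
f = lambda s: complex(s, 1)      # complex-valued, bounded on S
g = lambda s: -s * s / 5.0       # real-valued, bounded on S
h = lambda s: f(s) + g(s)        # the sum f + g

lhs = sup_norm(h, S)
rhs = sup_norm(f, S) + sup_norm(g, S)
print("||f + g|| =", lhs, " ||f|| + ||g|| =", rhs)

# Homogeneity (iii): ||c f|| = |c| ||f||, checked for one scalar c.
c = 3j
print(sup_norm(lambda s: c * f(s), S), abs(c) * sup_norm(f, S))
```

In this particular case the triangle inequality is strict, because |f| and |g| do not attain their maxima in a way that adds up at a single point.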


EXAMPLE 5. Let J be a closed interval on the real line. Let BV(J) 
be the space of complex-valued [real-valued] functions of bounded 
variation on J. There is a natural semi-norm on BV(J), namely, total variation 
(p. 161): 

(6.4) ||f|| = V(f). 

The properties of a semi-norm are easily verified from the definition of 
the total variation of a function. In this case, some non-zero vectors do 
have norm zero. In fact ||f|| = 0 if and only if f is a constant function. 
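
To see concretely why constants have semi-norm zero, one can compute the variation sum over a fixed partition. The sketch below assumes the usual definition of total variation as the supremum of such sums over all partitions, so a single partition only gives a lower bound in general; the function name and the sample functions are hypothetical choices made for illustration.

```python
def variation_on_partition(f, points):
    """Sum of |f(t_i) - f(t_{i-1})| over consecutive partition points."""
    values = [f(t) for t in points]
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


points = [i / 100.0 for i in range(101)]   # a partition of J = [0, 1]
f = lambda t: abs(t - 0.5)                 # falls by 1/2, then rises by 1/2
const = lambda t: 7.0                      # a non-zero constant function
shifted = lambda t: f(t) + 7.0             # f plus a constant

print(variation_on_partition(f, points))        # close to 1
print(variation_on_partition(const, points))    # exactly 0
print(variation_on_partition(shifted, points))  # same as for f, up to rounding
```

Adding a constant does not change any of the differences, which is the reason the semi-norm cannot distinguish f from f plus a constant.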


EXAMPLE 6. If J is a closed interval on the real line, let C^1(J) be the 
space of functions of class C^1 on J. A norm on C^1(J) which is frequently 
useful is 


(6.5) ||f|| = ||f||_∞ + ||f'||_∞ 
            = sup |f| + sup |f'|. 


The verification that this is a norm is best carried out by noting that (1) 
the sup norm is a norm on C^1(J), (2) f → ||f'||_∞ is a semi-norm on C^1(J), 
(3) the sum of a norm and a semi-norm is a norm. There is an analogous norm 
on C^k(J), 

||f|| = ||f||_∞ + ||f'||_∞ + ... + ||f^{(k)}||_∞. 


EXAMPLE 7. Consider L(R^n, R^k), the space of linear transformations 
from R^n into R^k. In Example 21 of Chapter 3 we discussed a norm on 
L(R^n, R^k): 

(6.6) ||T|| = sup {|T(X)|; |X| ≤ 1}. 


Note that ||T|| is the sup norm of the restriction to the unit ball of the 
function |T|. This makes it easy to verify that (6.6) defines a semi-norm. 
The fact that it is a norm results from the observation that, if a linear 
transformation is 0 on the unit ball, it is 0. 


EXAMPLE 8. Let J be a closed interval in R^1 and let PC(J) be the space 
of piecewise-continuous functions on J. Then PC(J) is a linear subspace 
of the space of all bounded functions on J. Therefore, the sup norm is 
a norm on PC(J). There is another semi-norm on PC(J) which has an 
important connection with integration: 


||f||_1 = ∫_J |f|. 


The verification that this is a semi-norm is straightforward. We merely 
remark that it utilizes the fact that integration is linear. We can have 
||f||_1 = 0 without f = 0. It is not difficult to see that ||f||_1 = 0 if and only 
if f(x) = 0 except at a finite number of points. 

Note that the integral semi-norm ||f||_1 is actually a norm on the 
space C(J), because a continuous function for which 


∫_J |f| = 0 

must be 0 on J. 

There is a class of semi-norms on PC(J) which are analogous to 
norms on R^n: 


(6.7) ||f||_p = (∫_J |f|^p)^{1/p},  1 ≤ p < ∞. 


EXAMPLE 9. Let p be a number, 1 ≤ p < ∞. Let ℓ^p be the space of 
all sequences X = (x_1, x_2, ...) of complex [real] numbers for which 


Σ_n |x_n|^p < ∞. 

On ℓ^p we have a norm 


(6.8) ||X||_p = (Σ_n |x_n|^p)^{1/p}. 


The space of bounded sequences is denoted ℓ^∞ and has the norm 

(6.9) ||X||_∞ = sup {|x_n|; n ∈ Z_+}. 


Our main point in mentioning these spaces ℓ^p is to strengthen the analogy 
(which we hope the reader has sensed) between Examples 2 and 8. In this 
Example (as in Example 8), concentrate mainly on the values p = 1, 2, ∞, 
where the verification of the triangle inequality is easier. Note that, until 
one has verified the triangle inequality, it is not apparent that ℓ^p is a 
subspace of the space of sequences. 


EXAMPLE 10. Let K be a compact set in R”. A complex-valued func- 
tion f on K is said to satisfy a Lipschitz condition if there exists a con- 
stant c > 0 such that 


|f(X) − f(Y)| ≤ c|X − Y|,  X, Y ∈ K. 

The smallest such c is called the Lipschitz norm of f: 

||f|| = sup_{X ≠ Y} |f(X) − f(Y)| / |X − Y|. 


The space of functions which satisfy a Lipschitz condition on K is a 
subspace of C(K) and the Lipschitz norm is a semi-norm on this subspace. 
Of course ||f|| = 0 if and only if f is constant. 


EXAMPLE 11. This example will be a brief discussion of an important 
special class of normed linear spaces. If L is a real [complex] linear space, 
an inner product on L is a real- [complex-] valued function ⟨·, ·⟩ on 
L × L such that 


(i) ⟨X, X⟩ ≥ 0; if ⟨X, X⟩ = 0, then X = 0; 
(ii) ⟨Y, X⟩ = ⟨X, Y⟩*; 
(iii) ⟨cX + Y, Z⟩ = c⟨X, Z⟩ + ⟨Y, Z⟩. 

An inner product space is a linear space L, together with an inner product 
on L. In any inner product space, we have the Cauchy-Schwarz inequality 

|⟨X, Y⟩|^2 ≤ ⟨X, X⟩⟨Y, Y⟩. 

It is trivial when ⟨Y, Y⟩ = 0, and when ⟨Y, Y⟩ ≠ 0, it results from 
applying the inequality 0 ≤ ⟨X + cY, X + cY⟩ to the case where c 
is the constant 


c = −⟨X, Y⟩ / ⟨Y, Y⟩. 

It follows from the Cauchy-Schwarz inequality that 

(6.10) ||X|| = ⟨X, X⟩^{1/2} 


satisfies the triangle inequality and is therefore a norm on L: 

||X + Y||^2 = ⟨X + Y, X + Y⟩ 
            = ⟨X, X⟩ + ⟨X, Y⟩ + ⟨Y, X⟩ + ⟨Y, Y⟩ 
            = ||X||^2 + 2 Re ⟨X, Y⟩ + ||Y||^2 
            ≤ ||X||^2 + 2 ||X|| ||Y|| + ||Y||^2 
            = (||X|| + ||Y||)^2. 


The norm on an inner product space is a very special type of norm, 
because it satisfies the parallelogram law: 
||X + Y||^2 + ||X − Y||^2 = 2[||X||^2 + ||Y||^2]. 

If we start with a normed linear space, and if the norm satisfies the 
parallelogram law, then the norm comes from an inner product, i.e., there 
is one and only one inner product on the space which is related to the 
norm by (6.10). In the real case, the inner product is obtained from the 
norm by 


(6.11) ⟨X, Y⟩ = (1/4)[||X + Y||^2 − ||X − Y||^2]. 

In the complex case, the inner product is 

(6.12) ⟨X, Y⟩ = (1/4) Σ_{n=1}^{4} i^n ||X + i^n Y||^2. 


In the examples we have discussed thus far, there were few inner 
product spaces. Of course, Euclidean space of dimension n is an inner 
product space. The space of continuous functions on a box B with the 
norm 

||f||_2 = (∫_B |f|^2)^{1/2} 

is an inner product space. The inner product which gives that norm is 


(6.13) ⟨f, g⟩ = ∫_B f g*. 

The space ℓ^2 (Example 9) is an inner product space: 

(6.14) ⟨X, Y⟩ = Σ_n x_n y_n*. 


Both it and the integration example are infinite-dimensional analogues 
of Euclidean space. 
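
The parallelogram law gives a quick numerical test of whether a norm could come from an inner product. The following sketch, for real vectors in R^2 only, checks the law for the Euclidean norm and for the norm ||...||_1, and recovers the dot product from the Euclidean norm by the real polarization formula. All function names and sample vectors are illustrative choices, not notation from the text.

```python
import math


def norm2(x):
    """Euclidean length."""
    return math.sqrt(sum(t * t for t in x))


def norm1(x):
    """The norm |x_1| + ... + |x_n|."""
    return sum(abs(t) for t in x)


def parallelogram_gap(norm, x, y):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2); zero when the law holds."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return norm(s) ** 2 + norm(d) ** 2 - 2 * (norm(x) ** 2 + norm(y) ** 2)


def polarization(norm, x, y):
    """(1/4)(||x+y||^2 - ||x-y||^2), the real-case formula for the inner product."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return 0.25 * (norm(s) ** 2 - norm(d) ** 2)


X, Y = [1.0, 2.0], [3.0, -1.0]
print(parallelogram_gap(norm2, X, Y))                    # about 0
print(polarization(norm2, X, Y))                         # about 1, the dot product
print(parallelogram_gap(norm1, [1.0, 0.0], [0.0, 1.0]))  # 4.0, so the law fails
```

A single failure of the law, as for ||...||_1 here, is enough to show that the norm does not arise from any inner product.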


Exercises 


1. Let k be a positive integer, and let S be the space of all sequences 
X = (X_1, X_2, ...) 
of vectors in R^k. Which of the following sets of sequences X are subspaces of S? 
(a) all X which are bounded; 
(b) all X such that {X_n} converges; 
(c) all X such that Σ X_n converges absolutely; 
(d) all X such that |X_n| converges to 1. 


2. True or false? The set of discontinuous functions is a subspace of the space 
of real-valued functions on [0, 1]. How about the set of convex functions ? 


3. Each semi-norm satisfies ||X − Y|| ≥ ||X|| − ||Y||. 
4. On a linear space, the sum of a semi-norm and a norm is a norm. 
5. If || --- || is a semi-norm on L, then M = {X; || X || = 0} is a subspace of L. 


6. If ||...|| is a norm on the real linear space L and if it satisfies the 
parallelogram law 
||X + Y||^2 + ||X − Y||^2 = 2[||X||^2 + ||Y||^2] 
then 
⟨X, Y⟩ = (1/4)[||X + Y||^2 − ||X − Y||^2] 
is an inner product on L. 


7. By examining the proof of the Cauchy-Schwarz inequality, show that it is 
a strict inequality except in the case where the two vectors involved are linearly 
dependent (one is a scalar multiple of the other). 


8. In a normed linear space (L, ||...||), distance is defined by d(X, Y) = 
||X − Y||. 
(a) Prove that, in an inner product space, the shortest distance between two 
points is (along) a straight line. 
(b) Give an example which shows that this property of straight lines fails in 
some normed linear space. 
9. Show that, on a closed interval of the real line, every continuously-differen- 
tiable function satisfies a Lipschitz condition. Prove that the Lipschitz norm for 
such a function is the sup norm of its derivative. 


10. Refer to the norms in Example 2. Prove that 
||X||_∞ = lim_{p→∞} ||X||_p. 
11. If f is continuous on [0, 1], then 
lim_{p→∞} (∫_0^1 |f|^p)^{1/p} = sup |f|. 
12. Let C^n(J) be the space of functions of class C^n on a closed interval J. Show that 

||f|| = Σ_{k=0}^{n} (1/k!) ||f^{(k)}||_∞ 


is a norm on C^n(J) and that it satisfies ||fg|| ≤ ||f|| ||g||. 


6.2. Norms on R^n 


We shall give a geometrical description of all possible norms on 
R^n. This will make it easier to see that the functions ||...||_p, 1 ≤ p < ∞, 
are norms on R^n. Here is the basic fact. 


Theorem 1. Let ||...|| be a norm on R^n. There exist positive constants 
k, K such that 

(6.15) k|X| ≤ ||X|| ≤ K|X|,  X ∈ R^n. 

Proof. Let E_1, ..., E_n be the standard basis vectors in R^n: 


X = (x_1, ..., x_n) 
  = x_1 E_1 + ... + x_n E_n. 

Then the triangle inequality tells us that 


||X|| ≤ |x_1| ||E_1|| + ... + |x_n| ||E_n||. 

Let 

M = max_j ||E_j|| 


and we have 

||X|| ≤ M Σ_j |x_j|. 

Each |x_j| is dominated by the length of X, so that 

Σ_j |x_j| ≤ n|X|. 


Therefore, if K = nM, 

(6.16) ||X|| ≤ K|X|,  X ∈ R^n. 

Thus, we have half of what we want to know. 

Now we want to prove that there exists k > 0 such that 

k|X| ≤ ||X||,  X ∈ R^n. 

Since |cX| = |c||X| and ||cX|| = |c| ||X||, we need only find k > 0 so 
that 

k|X| ≤ ||X||,  |X| = 1. 

In other words, we wish to show that 

(6.17) 0 < inf {||X||; |X| = 1}. 

Now, by (6.16) ||...|| is a (uniformly) continuous function on R^n: 

| ||X|| − ||Y|| | ≤ ||X − Y|| 
                  ≤ K|X − Y|. 

The unit sphere {X; |X| = 1} is compact. Hence the infimum of ||X|| 
over the unit sphere is attained at some point of the sphere, i.e., there 
exists Y with |Y| = 1 such that 


||Y|| = inf {||X||; |X| = 1}. 

Now ||Y|| ≠ 0 because Y ≠ 0. That establishes (6.17), and we are done. 
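
The two constants in Theorem 1 can be estimated numerically for a concrete norm. The sketch below does this in R^2 for the norm ||...||_1 by sampling the Euclidean unit circle, and also computes the cruder constant K = nM used in the proof. Sampling only approximates the infimum and supremum, and the function names are hypothetical.

```python
import math


def sphere_bounds(norm, samples=3600):
    """Sample the Euclidean unit circle in R^2; return (min, max) of norm there."""
    values = []
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        values.append(norm((math.cos(theta), math.sin(theta))))
    return min(values), max(values)


def crude_K(norm, n=2):
    """The constant K = n*M from the proof, where M = max_j ||E_j||."""
    basis = [tuple(1.0 if i == j else 0.0 for i in range(n)) for j in range(n)]
    return n * max(norm(e) for e in basis)


norm1 = lambda x: sum(abs(t) for t in x)

k_est, K_est = sphere_bounds(norm1)
print("estimated k:", k_est)               # about 1, attained on the axes
print("estimated best K:", K_est)          # about sqrt(2), attained on the diagonals
print("K from the proof:", crude_K(norm1)) # 2.0, valid but not sharp
```

The comparison shows that the proof's K = nM is a valid upper constant but need not be the smallest one.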


Because of the fact that ||cX|| = |c| ||X||, every norm on R^n is 
determined by its unit ball: 


B = {X; ||X|| < 1}. 

What does such a unit ball look like? First, B is an open set, because 
||...|| is a continuous function on R^n. Second, B is symmetric about the 
origin: If X ∈ B then (−X) ∈ B. Third, B is convex: If X, Y are in B, 
the line segment between X and Y is in B. Fourth, B is bounded: If 
X ∈ B then |X| < 1/k (6.15). 


Theorem 2. Let B be a subset of R^n which is bounded, open, convex, and 
symmetric about the origin. Define 


(6.18) ||X|| = inf {t; t > 0 and (1/t)X ∈ B}. 

Then ||...|| is a norm on R^n and B is the unit ball for this norm, B = 
{X; ||X|| < 1}. 


Proof. Let X be any vector in R^n. Consider the set of numbers 

M_X = {t; t > 0 and (1/t)X ∈ B} 


involved in defining the proposed norm of X. Then M_X is a very simple 
type of set, because it has the defining property of a half-line: If t ∈ M_X 
and s > t, then s ∈ M_X. For 


(1/s)X = cY,  where Y = (1/t)X,  c = t/s. 

Since B is a convex set which contains the origin, the facts that Y ∈ B 
and 0 < c < 1 imply that cY ∈ B. 

It is important to show that M_X is non-empty. Since 

lim_{c→0} cX = 0 

and B is a neighborhood of 0, we know that cX ∈ B for all sufficiently 
small scalars c. Therefore t ∈ M_X for all sufficiently large positive 
numbers t. So M_X is a non-empty open subset of the positive real line. 

Now consider the proposed norm 


||X|| = inf M_X. 

Since M_X is non-empty, 

0 ≤ ||X|| < ∞. 


Furthermore, from the first two paragraphs of the proof, we know that 
M_X is the open half-line 


M_X = {t; t > ||X||}. 

Let c be a positive scalar. If t > 0, then 

(1/t)X = (1/(ct))(cX). 

Hence, t ∈ M_X if and only if (ct) ∈ M_{cX}. Thus M_{cX} = cM_X and 

||cX|| = c||X||,  c > 0. 

Since B is symmetric about the origin, M_X = M_{−X}, and 

||−X|| = ||X||. 

We conclude that for all scalars c 

||cX|| = |c| ||X||. 


Now let us verify the triangle inequality. Given vectors X, Y and 
positive numbers s, t 


(6.19) (1/(s + t))(X + Y) = (s/(s + t))((1/s)X) + (t/(s + t))((1/t)Y). 

Since 

s/(s + t) + t/(s + t) = 1 


we see from (6.19) that the vector (s + t)^{-1}(X + Y) is on the line segment 
joining (1/s)X to (1/t)Y. Since B is a convex set, this tells us that if s ∈ M_X 
and t ∈ M_Y, then (s + t) ∈ M_{X+Y}. Therefore, 


inf M_{X+Y} ≤ inf M_X + inf M_Y, 

that is, 

||X + Y|| ≤ ||X|| + ||Y||. 


We have verified that ||...|| is a semi-norm. To see that it is a norm, 
we use the fact that B is bounded. Let X be any non-zero vector. Since B 
is bounded, there is a positive scalar c such that cX ∉ B. Then 1/c is a 
lower bound for the set M_X so that ||X|| ≠ 0. 

This completes the proof, except for the fact that B is the unit ball 
of the norm. We have left the verification of this to the exercises. 

Corollary. Let p be a number, 1 ≤ p < ∞. Then 

||X||_p = (Σ_k |x_k|^p)^{1/p} 

is a norm on R^n. 

Proof. Let 

B_p = {X ∈ R^n; |x_1|^p + ... + |x_n|^p < 1}. 

Since f(x) = x^p is a continuous function on the positive real line, B_p is 
an open subset of R^n. Since 


lim_{x→∞} x^p = ∞ 


B_p is bounded. Clearly B_p is symmetric about the origin. 

To see that B_p is a convex set, we verify that 


f(x) = x^p,  x ≥ 0 

is a convex function. This is an immediate consequence of the fact that 
(since p ≥ 1) its derivative is an increasing function. (See Example 3 and 
Exercise 11 of Section 4.1.) 
All that remains to be done is to check that the norm on R^n associated 
with B_p in Theorem 2 is ||...||_p. If X ∈ R^n and t > 0, then (1/t)X ∈ B_p 
means 

Σ_k |x_k / t|^p < 1 

or 

(1/t^p) Σ_k |x_k|^p < 1. 


Therefore, 

inf {t; (1/t)X ∈ B_p} = inf {t; t^p > Σ_k |x_k|^p} 
                      = inf {s^{1/p}; s > Σ_k |x_k|^p} 
                      = (Σ_k |x_k|^p)^{1/p}. 


It is sometimes useful to have in mind the relative shapes of the unit 
balls B_p, 1 ≤ p < ∞. These are indicated in Figure 24, which also includes 
the limiting case B_∞, which is the unit ball for the sup norm. 
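
The construction in Theorem 2 can be imitated numerically: given only a membership test for B, one can approximate inf {t; t > 0 and (1/t)X ∈ B} by bisection, using the fact that the admissible t form an open half-line. The sketch below assumes B is bounded, open, convex and contains the origin; the names gauge and in_B3 are hypothetical, and the test set is the ball B_p with p = 3.

```python
def gauge(in_B, x, hi=1.0, tol=1e-12):
    """Approximate inf{t > 0 : x/t in B} by bisection (a sketch, see assumptions)."""
    if all(c == 0 for c in x):
        return 0.0
    # Grow hi until x/hi lies in B; possible because B is a neighborhood of 0.
    while not in_B([c / hi for c in x]):
        hi *= 2.0
    lo = 0.0
    # The admissible t form the half-line (||x||, infinity): find its left end.
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if in_B([c / mid for c in x]):
            hi = mid
        else:
            lo = mid
    return hi


def in_B3(y):
    """Membership test for the open ball B_p with p = 3."""
    return sum(abs(c) ** 3 for c in y) < 1


X = [1.0, 2.0, -2.0]
print(gauge(in_B3, X))      # approximately 17 ** (1/3)
print(17 ** (1.0 / 3.0))    # the value of ||X||_3 from the formula
```

The agreement of the two printed numbers illustrates the last step of the corollary's proof for one vector; swapping in a different membership test gives the norm attached to a different unit ball.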


Exercises 


1. Can the unit ball for a norm on R^2 be a triangle? A pentagon? A hexagon? 


2. Each ellipse centered at the origin defines a norm on R^2. Suppose the axes 
of the ellipse are the coordinate axes. From the equation of the ellipse, derive 
an algebraic formula for the norm on R^2 of which (the interior of) the ellipse is 
the unit ball. 

FIGURE 24 


3. If 0 < p < 1, show that the set 
B_p = {X ∈ R^n; Σ_k |x_k|^p < 1} 
is not convex. 


4. Use the corollary in this section to show that the sequence spaces ℓ^p in 
Example 9 are normed linear spaces. 


5. Use the corollary in this section to show that for each p, 1 ≤ p < ∞, 


||f||_p = (∫_B |f|^p)^{1/p} 


is a semi-norm on PC(B). See Example 8. 


6. Prove that, for p > 1, the function f(x) = x^p, x ≥ 0, is strictly convex, 
i.e., if x < y and 0 < t < 1, then 


f(tx + (1 − t)y) < tf(x) + (1 − t)f(y). 


Use this to infer that the closed unit ball for the norm ||...||_p on R^n is “strictly 
convex”, in a sense which you define. 


7. True or false? A norm on R^2 satisfies the parallelogram law (comes from an 
inner product) if and only if its unit ball is (the interior of) an ellipse? 


8. Is there a relationship between “strict convexity” of the closed unit ball for 
a norm and the fact that the triangle inequality is strict when the two vectors 
involved are linearly independent? 


6.3. Convergence and Continuity 


In Chapters 2 and 3 we discussed convergence of sequences in R^n 
and the related concepts of open set, closed set, compact set, continuous 
function, etc. All of these ideas can be discussed in much greater generality; 
indeed, most of the results are valid in considerable generality. We are 
now going to describe that collection of ideas in the context of normed 
linear spaces. Most of the results are valid with precisely the same 
reasoning as was used in R^n. We shall conduct a review of those results. Of course, 
we shall also point out those results which utilized special properties of R^n. 

Let L be a normed linear space. If X ∈ L, the open ball of radius 
ε about X is 


B(X; ε) = {Y ∈ L; ||Y − X|| < ε}. 


A neighborhood of X is a subset of L which contains some open ball about 
X. The sequence {X_n} converges to X if every neighborhood of X contains 
X_n, except for a finite number of values of n. If {X_n} converges to X, then 
we write 

X = lim X_n. 


We have the elementary results that 
(i) X = lim X_n if and only if 0 = lim ||X − X_n||; 
(ii) if X = lim X_n and Y = lim Y_n then X + Y = lim (X_n + Y_n) 
and cX = lim cX_n. 


The set U is open if it is a neighborhood of each of its points. The 
point X is a cluster point of the set K if every neighborhood of X contains 
infinitely many points of K. The set K is closed if it contains all of its cluster 
points. The set U is open if and only if L — U is closed. The family of 
open sets is closed under the formation of arbitrary unions and finite 
intersections. There is a complementary statement about closed sets: 
arbitrary intersections and finite unions. The closure of a set is the 
intersection of all closed sets which contain it. The interior of a set is the union 
of all open sets which it contains. The set S is dense in the set T if S is a 
subset of T and the closure of S contains T. The set K is compact if each 
open cover of K has a finite subcover. 

Relative neighborhoods, relatively open sets, relatively closed sets 
in S are defined by intersecting the corresponding families of sets with 
S. The set S is connected provided S and the empty set are the only subsets 
of S which are both open and closed relative to S. 

Let L and M be normed linear spaces and suppose 


F: D → M,  D ⊂ L. 


If X ∈ D, then F is continuous at X provided, for each neighborhood 
V of F(X), the inverse image F^{-1}(V) is a neighborhood of X relative to 
D. We call F continuous if F is continuous at each point of D. The 
following are equivalent. 


(i) F is continuous. 
(ii) If X ∈ D, and ε > 0, there exists δ > 0 such that 


||F(T) − F(X)|| < ε,  T ∈ D,  ||T − X|| < δ. 

(iii) If V is any open set in M, then F^{-1}(V) is open relative to D. 
(iv) If {X_n} is a sequence in D which converges to the point X ∈ D, 
then F(X_n) converges to F(X). 


Sums and compositions of continuous functions are continuous. 

The image of a compact [connected] set under a continuous mapping 
is compact [connected]. 

The function F 


F: D → M,  D ⊂ L 

is uniformly continuous if, for each ε > 0, there exists δ > 0 such that 

||F(X_1) − F(X_2)|| < ε,  ||X_1 − X_2|| < δ. 


A continuous function on a compact set is uniformly continuous. 

Several of the results from Chapters 2 and 3 are conspicuously absent 
from the brief summary which we have just given. Primarily, the absentees 
are those results which depended upon properties of R^n which are not 
possessed by every normed linear space. We want now to discuss the 
special properties of R^n. 

First, let us remark that (for the bulk of Chapters 2 and 3) it did 
not matter which norm we used on R^n. Theorem 1 shows us that all norms 
on R^n define the same convergent sequences, hence, the same closed sets, 
the same open sets, the same continuous functions, etc. Therefore, what 
we want to ask is this. Among normed linear spaces, what distinguishes 
the spaces which consist of R^n with some norm? 

The linear space L is finite-dimensional if there exists in L an n-tuple 
of vectors E_1, ..., E_n which spans L, i.e., which has the property that 
every X ∈ L is a linear combination of those vectors: 

(6.20) X = c_1 E_1 + ... + c_n E_n. 


If L is finite-dimensional, we can always choose vectors E_1, ..., E_n which 
span L and are linearly independent. (The number n is the dimension 
of L.) Via (6.20) we then have a 1:1 correspondence 


X ↔ (c_1, ..., c_n) 


between vectors X ∈ L and (all) vectors (c_1, ..., c_n) ∈ R^n. The norm 
on L corresponds to some norm on R^n. Thus, L behaves like R^n with some 
norm. 

So, we see that, as we look back on Chapters 2 and 3, we should 
ask which of the results there made use of the fact that we were dealing 
with a finite-dimensional normed linear space. There were two principal 
results of that type. 


1. In a finite-dimensional normed linear space, every Cauchy se- 
quence converges. 
2. In a finite-dimensional normed linear space, every bounded 
sequence has a convergent subsequence (i.e., every closed and bounded 
set is compact). 


The Bolzano-Weierstrass property (2) is more than a special 
property of finite-dimensional spaces; it is a characterization of such spaces. 
One can prove (see Section 6.5) that the normed linear space L is 
finite-dimensional if and only if each bounded sequence in L has a convergent 
subsequence. 

On the other hand, the completeness property (1) is enjoyed by many 
spaces which are not finite-dimensional. Completeness is a weaker property 
than is sequential compactness (2); however, completeness itself has 
consequences which are sufficiently important that we introduce a special 
name for spaces with that property. 


Definition. The normed linear space L is complete if every Cauchy 
sequence in L converges. A complete normed linear space is (also) called a 
Banach space. 


We shall have more to say later about special properties of Banach 
spaces. We should state now the two results from Chapters 2 and 3 which 
exploited the completeness of R^n. 


(i) If L is a Banach space and if Σ X_n is an infinite series of points 
of L which converges absolutely, 

Σ ||X_n|| < ∞, 

then the series converges. 
(ii) Let L be a normed linear space, let M be a Banach space and let 


F: D → M,  D ⊂ L 


be a uniformly continuous function. Then F can be extended to a 
continuous function on the closure of D: 


F: D̄ → M. 

EXAMPLE 12. Perhaps the first normed linear space which we met 
after R^n was the space of continuous functions on a compact set in R^k. 
Now, we may start with K, a compact subset of the normed linear space 
L. Let C(K) be the space of real- (or complex-) valued continuous func- 
tions on K. In many analysis problems, the “natural” norm on C(K) is 
the sup norm 


||f||_∞ = sup_K |f| 


because it measures when function values are uniformly close together: 
The norm ||f − g||_∞ is small if and only if |f(X) − g(X)| is small, 
uniformly in X. As we saw earlier, convergence in the sup norm 

||f − f_n||_∞ → 0 


is precisely uniform convergence of {f_n} to f. 

The normed linear space which consists of C(K) endowed with the 
sup norm is a Banach space. We repeat the skeleton of the argument, 
which should be familiar. Suppose {f_n} is a Cauchy sequence: 


lim_{m,n} ||f_m − f_n||_∞ = 0. 

Then, for each X ∈ K, the sequence of numbers {f_n(X)} is Cauchy, because 

|f_m(X) − f_n(X)| ≤ ||f_m − f_n||_∞. 


By the completeness of the space of real [or complex] numbers, each of 
these numerical sequences converges. Let f be the function on K defined 
by the rule 


f(X) = lim_n f_n(X). 

It is easy to see that {f_n} converges uniformly to f: If 

||f_m − f_n||_∞ ≤ ε,  m, n ≥ N 

then (taking the limit on m) 

||f − f_n||_∞ ≤ ε,  n ≥ N. 

The uniform convergence of f_n to f tells us that f is continuous. Thus, we 
have shown that {f_n} converges in the normed linear space L = (C(K), 
||...||_∞). 


EXAMPLE 13. Let us make two remarks about the special case of 
Example 12 in which K = [a, b], a closed interval on the real line. First, 
the polynomial functions 

p(x) = c_0 + c_1 x + ... + c_n x^n 


form a linear subspace of C([a, b]) and this subspace is dense in the 
Banach space (C, ||...||_∞). This is merely a restatement of the Weierstrass 
approximation theorem: Each continuous function on [a, b] can be 
uniformly approximated by polynomials. 

Second, it is easy to show that the Bolzano-Weierstrass property 
fails for this Banach space. Let 


(6.21) g(x) = (x − a)/(b − a),  a ≤ x ≤ b. 

Then the sequence of powers {g^n} is bounded: 

||g^n||_∞ = 1,  n = 1, 2, 3, .... 

but it has no uniformly convergent subsequence because 


(6.22) lim_n g^n(x) = 1 if x = b,  and 0 if a ≤ x < b. 

This pointwise convergence tells us that the only function to which a 
subsequence could converge uniformly is the function defined by the 
right side of Equation (6.22). Since this function is discontinuous, such 
uniform convergence is impossible. 
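
A numerical companion to this argument: taking [a, b] = [0, 1] and g(x) = x (an assumption made here for concreteness), the sup-norm distance between g^n and g^(2n) is 1/4 for every n, since u − u^2 has maximum 1/4 on [0, 1]. The sketch below estimates that distance on a grid; the helper names are hypothetical and the grid values are approximations.

```python
def sup_norm_on_grid(f, a, b, samples=100001):
    """Grid estimate of sup |f| on [a, b]; a lower bound for the true sup."""
    h = (b - a) / (samples - 1)
    return max(abs(f(a + i * h)) for i in range(samples))


a, b = 0.0, 1.0
g = lambda x: (x - a) / (b - a)


def gap(n):
    """Grid estimate of ||g^n - g^(2n)||_inf."""
    return sup_norm_on_grid(lambda x: g(x) ** n - g(x) ** (2 * n), a, b)


for n in (1, 5, 50):
    print(n, gap(n))    # each value is close to 0.25
```

Since the powers stay a fixed distance apart in this sense, the sequence is bounded without being Cauchy in the sup norm, which is consistent with the failure of the Bolzano-Weierstrass property described above.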


EXAMPLE 14. Let C^1(J) be the space of functions of class C^1 on a 
closed interval, with the norm 


||f|| = ||f||_∞ + ||f'||_∞. 


This is a Banach space. Let us see why. Suppose {f_n} is a Cauchy sequence. 
Since 

||f_m − f_n|| = ||f_m − f_n||_∞ + ||f_m' − f_n'||_∞ 


the sequence {f_n} converges uniformly and the sequence {f_n'} converges 
uniformly. It follows from Theorem 5 of Chapter 5 that f = lim f_n is 
differentiable and 

f' = lim f_n'. 


Since f_n' is continuous and converges uniformly to f', the function f' is 
continuous. Thus f ∈ C^1(J) and ||f − f_n|| → 0. 

Now C^1(J) is also a normed linear space using the sup norm. That 
normed linear space is not complete; it is (in fact) dense in C(J). 


EXAMPLE 15. Refer to the sequence spaces ℓ^p, from Example 9. These 
spaces are Banach spaces. Let’s see how we would verify that for the case 
p = 1. The space ℓ^1 consists of the sequences 


X = (x_1, x_2, ...) 

of complex numbers such that 

||X||_1 = Σ_j |x_j| < ∞. 


Suppose we have in ℓ^1 a sequence {X_n} which is Cauchy: 

lim_{m,n} ||X_m − X_n||_1 = 0. 

If X_n = (x_{n1}, x_{n2}, x_{n3}, ...), then for each k the sequence of kth coordinates 
{x_{nk}} is a Cauchy sequence of numbers: 

|x_{mk} − x_{nk}| ≤ ||X_m − X_n||_1. 

The completeness of the complex numbers tells us that for each k there 
exists x_k ∈ C such that 

lim_n x_{nk} = x_k. 

What we want to show is that the sequence X = (x_1, x_2, ...) is in ℓ^1 and 
that 

||X − X_n||_1 → 0. 

To that end, consider the double sequence 


s_{nk} = Σ_{j=1}^{k} |x_{nj}|. 

For each n, the limit on k of the sequence {s_{nk}} exists: 

lim_k s_{nk} = ||X_n||_1. 

For each k, the limit on n of the sequence {s_{nk}} exists: 

(6.23) lim_n s_{nk} = Σ_{j=1}^{k} |x_j|. 


Furthermore, the convergence in (6.23) is uniform in k because {s_{nk}} is 
Cauchy in n, uniformly in k: 


|s_{mk} − s_{nk}| = | Σ_{j=1}^{k} |x_{mj}| − Σ_{j=1}^{k} |x_{nj}| | 
                  ≤ Σ_{j=1}^{k} |x_{mj} − x_{nj}| 
                  ≤ ||X_m − X_n||_1. 

By the Moore-Osgood double limit theorem (Theorem 3, Chapter 5), 

lim_k Σ_{j=1}^{k} |x_j| 


exists and is equal to the limit on n of the sequence {||X_n||_1}. Thus X ∈ ℓ^1 
and 

||X||_1 = lim_n ||X_n||_1. 


To see that {X_n} converges to X in the norm of ℓ^1, apply the same 
double sequence argument we just used to the sequence 


t_{nk} = Σ_{j=1}^{k} |x_j − x_{nj}|. 


For each n, the limit on k of this sequence is ||X − X_n||_1. On the other 
hand, 

lim_n t_{nk} = 0 


and this convergence is uniform in k. It follows that 

lim_n ||X − X_n||_1 = lim_k 0 = 0. 


The sequence space ℓ^2 is called Hilbert’s space. It has become common 
to call any complete inner product space a Hilbert space. 


EXAMPLE 16. An important normed linear space is the one consisting 
of C(B), the space of continuous functions on a closed box in R^n, together 
with the norm 


||f||_1 = ∫_B |f|. 

In this space, two functions, f and g, are close together when the integral 


∫_B |f − g| 


is small. This is quite a different measure of approximation than that 
which is provided by the sup norm. Since 


∫_B |f − g| ≤ ||f − g||_∞ m(B) 


it is true that uniform convergence implies convergence in the norm 
||...||_1; however, we may have 


||f − f_n||_1 → 0 


while {f_n} fails to converge pointwise, much less uniformly. Indeed, it 
can happen that 


(6.24) ||f_n||_1 → 0 and ||f_n||_∞ → ∞. 


The phenomenon (6.24) is illustrated in the case B = [0, 1] by the func- 
tions f,(x) = ./n sin nx or by the tent functions in Figure 25. 

The space C(B) is not complete relative to the norm ||...||_1. For the 
case B = [0, 1], this is treated in the exercises. 
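
The phenomenon (6.24) is easy to reproduce with tent functions. The exact tents of Figure 25 are not recoverable here, so the sketch below assumes one particular family on [0, 1]: f_n rises linearly from 0 to height n on [0, 1/n^2], falls back to 0 on [1/n^2, 2/n^2], and vanishes elsewhere (n ≥ 2). Because these functions are piecewise linear and non-negative, the trapezoid rule on the breakpoints gives the integral exactly, up to rounding.

```python
def tent_nodes(n):
    """Breakpoints (x, f_n(x)) of the assumed tent: height n, base [0, 2/n^2]."""
    return [(0.0, 0.0), (1.0 / n ** 2, float(n)), (2.0 / n ** 2, 0.0), (1.0, 0.0)]


def l1_norm(nodes):
    """Integral of a non-negative piecewise-linear function via the trapezoid rule."""
    return sum(0.5 * (y0 + y1) * (x1 - x0)
               for (x0, y0), (x1, y1) in zip(nodes, nodes[1:]))


def sup_norm(nodes):
    """Sup norm of a piecewise-linear function: the largest |value| at a breakpoint."""
    return max(abs(y) for _, y in nodes)


for n in (2, 10, 100, 1000):
    nodes = tent_nodes(n)
    print(n, l1_norm(nodes), sup_norm(nodes))   # integral is 1/n, sup is n
```

With this choice the integral norm is 1/n while the sup norm is n, so the first tends to 0 and the second to infinity, as in (6.24).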

FIGURE 25 

We close this section with a remark. The little review of convergence, 
continuity, etc., which we carried out at the beginning of the section can 
be carried out just as easily in the context of metric spaces. A metric space 
is a set S together with a real-valued function d on S × S which satisfies 

(i) d(X, Y) ≥ 0; d(X, Y) = 0 if and only if X = Y; 

(ii) d(X, Y) = d(Y, X); 

(iii) d(X, Z) ≤ d(X, Y) + d(Y, Z). 

Such a function d is called a metric or distance function on S. Each normed 
linear space (L, ||...||) has a natural metric on it: d(X, Y) = ||X − Y||. 
By analogy with the normed linear space case, it should be clear how one 
defines ε-ball, open set, convergence and so on in a metric space. The 
interested reader can check to see that the basic results of Chapters 2 and 
3 are valid in that setting. About the only thing to be gained by using metric 
spaces rather than subsets of normed linear spaces is that to a certain 
extent one avoids relatively open, relatively closed, etc. 


Exercises 


1. Let K be a compact set and let C(K) be the space of real continuous functions 
on K. Equip C(K) with the sup norm. Show that the set of positive continuous 
functions is an open subset of C(K). 


2. Equip (real) C(B) with the norm 


$$\|f\|_1 = \int_B |f|$$


(B is a box). Is the set of positive continuous functions open? 


3. Equip C([0, 1]) with the sup norm. Fix a positive integer n, and let $P_n$ be the 
subspace of polynomial functions of degree at most n: 

$$p(x) = c_0 + c_1 x + \cdots + c_n x^n.$$

Show that $P_n$ is a closed subspace of C([0, 1]). 
4. If X is a Banach space, every absolutely convergent series in X converges. 
5. In a normed linear space, every convex set is connected. 


6. If a normed linear space has the Bolzano-Weierstrass property, it is com- 
plete. 


7. Equip C([0, 1]) with the norm 

$$\|f\|_1 = \int_0^1 |f(x)|\,dx.$$

Let L be the function on C([0, 1]) 

$$L(f) = f(0).$$


Is L continuous? 
8. Let S be the function from C([0, 1]) into C([0, 1]) defined by squaring: 
$S(f) = f^2$. If C([0, 1]) is endowed with the sup norm, is S continuous? What if 
we use the integral norm $\|\cdots\|_1$? 
9. If we pin down the constant of integration, then anti-differentiation defines 
a mapping (function) 

$$I : C([a, b]) \to C([a, b]), \qquad (If)(x) = \int_a^x f(t)\,dt.$$

Is I continuous relative to the sup norm? The norm $\|\cdots\|_1$? 


10. Differentiation maps functions of class $C^1$ into continuous functions. Think 
about whether differentiation is continuous for various norms on $C^1([a, b])$ and 
C([a, b]). 


11. Let f be a piecewise-continuous function on [0, 1] which is not continuous 
and cannot be made continuous by altering its values at a finite number of points, 
e.g., f(x) = 0 on [0, 1/2) and f(x) = 1 on [1/2, 1]. 

(a) Show that there exist continuous functions $f_n$ such that $\|f - f_n\|_1$ 
goes to 0. ($\|\cdots\|_1$ is the integral norm.) 
(b) Show that no sequence $\{f_n\}$ as in (a) can converge in the normed linear 
space (C, $\|\cdots\|_1$). Hint: If $f_n$ converged in the norm $\|\cdots\|_1$ to a continuous 
function g, what could you conclude about $\|f - g\|_1$? 
12. Define functions $f_n$ on [0, 1] by 


" O<x< τ 
f(x) = . ἢ 
--- -τι «Λχ-] 
x’? π 
Show that $\{f_n\}$ is a Cauchy sequence in (C, $\|\cdots\|_1$) but does not converge in that 
space. 
13. True or false? If $C^1([a, b])$ is given the norm 

$$\|f\| = \sup |f| + \sup |f'|$$

then the subspace of polynomial functions is dense in $C^1([a, b])$. 


14. If X is a normed linear space, every finite-dimensional subspace of X is 
closed. 


15. True or false? The space Lip (K) in Example 10 is complete. 
16. Is $\ell^1$ dense in $\ell^\infty$? 


17. The space of functions of bounded variation (Example 5) is a complete semi- 
normed space. 


18. Refresh your memory by giving the proof of this theorem: A uniformly con- 
tinuous function from a (subset of a) normed linear space into a Banach space 
can be extended to a continuous function on the closure of its domain. 


19. Let L and M be normed linear spaces and let T be a transformation from L 
into M which is linear: $T(cX_1 + X_2) = cT(X_1) + T(X_2)$. Prove that T is 
continuous if and only if there exists a non-negative constant k such that 

$$\|T(X)\| \le k\|X\|, \qquad \text{all } X \in L.$$


(As a matter of convenience, the same symbol has been used for the norms in 
L and M, although these norms are presumably different.) 
20. In the notation of Exercise 19, let B(L, M) be the set of all continuous linear 
transformations from L into M. Show that B(L, M) is a linear space in a natural 
way and that 

$$\|T\| = \sup_{X \ne 0} \frac{\|T(X)\|}{\|X\|}$$

is a norm on that linear space. 
There are several fundamental properties of Banach spaces which 
are not possessed by normed linear spaces in general. We do not have 
time to go into all of these. We shall content ourselves with presenting 
two basic results, along with some of their consequences. These results 
will illustrate the fact that completeness (like compactness) is used to 
prove that certain objects exist. 

We shall be interested in the completeness of subsets of a normed 
linear space L, even in situations where L itself might fail to be complete. 
Of course, the subset K is called complete if each Cauchy sequence in K 
converges to a vector in K. Every complete subset is closed. If L is a Banach 
space, a subset K is complete if and only if it is closed. 


Theorem 3. If L is a normed linear space, every finite-dimensional sub- 
space of L is complete and is, therefore, a closed subset of L. 


Proof. Let M be a subspace of (finite) dimension k. Let $E_1, \ldots, E_k$ 
be a basis for M. The map 

$$T : R^k \to M \qquad (\text{use } C^k \text{ in the complex case})$$

defined by $T(c_1, \ldots, c_k) = c_1 E_1 + \cdots + c_k E_k$ establishes a 1:1 
correspondence between vectors in $R^k$ and vectors in M which preserves 
linear operations. Thus 

$$|||(c_1, \ldots, c_k)||| = \|T(c_1, \ldots, c_k)\|$$

defines a norm $|||\cdots|||$ on $R^k$. Since all norms on $R^k$ define the same 
convergent sequences, ($R^k$, $|||\cdots|||$) is a complete normed linear space. 
Thus M is a complete space. 


In order to show that a subset of a normed linear space is complete, 
it is not necessary to verify that every Cauchy sequence converges. It is 
sufficient, and usually it is technically simpler, to work with sequences 
of the following special type. 


Definition. Let L be a normed linear space. A fast Cauchy sequence in 
L is a sequence $\{X_n\}$ such that 

$$\sum_{n} \|X_n - X_{n+1}\| < \infty. \tag{6.25}$$

The terminology requires some justification. We need to show that 
a sequence which satisfies (6.25) is a Cauchy sequence. To verify this, 
first note that 

$$\lim_{N \to \infty} \sum_{k=N}^{\infty} \|X_k - X_{k+1}\| = 0.$$

Thus, given ε > 0, there is a positive integer $N_\varepsilon$ such that 

$$\sum_{k=N_\varepsilon}^{\infty} \|X_k - X_{k+1}\| < \varepsilon. \tag{6.26}$$

Suppose that n > m ≥ $N_\varepsilon$. Then 

$$X_m - X_n = \sum_{k=m}^{n-1} (X_k - X_{k+1})$$

and hence 

$$\|X_m - X_n\| \le \sum_{k=m}^{n-1} \|X_k - X_{k+1}\|.$$

By (6.26) 

$$\|X_m - X_n\| < \varepsilon, \qquad m, n \ge N_\varepsilon.$$

Recall that, in order for $\{X_n\}$ to be a Cauchy sequence, it is not 
sufficient that 

$$\lim_n \|X_n - X_{n+1}\| = 0.$$

What we have just verified is that, if $\|X_n - X_{n+1}\|$ goes to 0 fast enough 
so that 

$$\sum_n \|X_n - X_{n+1}\| < \infty$$

then the sequence is Cauchy. 
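The distinction between "steps going to 0" and "steps summable" can be seen in the simplest normed linear space, the real line. The following is a minimal sketch with hypothetical helper names (`partial_sums`, `step_total`): the partial sums of the harmonic series have steps tending to 0 but are not Cauchy, while the partial sums of $\sum 1/k^2$ form a fast Cauchy sequence.

```python
def partial_sums(term, count):
    """Return [X_1, ..., X_count] where X_n = term(1) + ... + term(n)."""
    sums, total = [], 0.0
    for k in range(1, count + 1):
        total += term(k)
        sums.append(total)
    return sums

def step_total(xs):
    """Sum of |X_n - X_{n+1}| over the finite list xs."""
    return sum(abs(a - b) for a, b in zip(xs, xs[1:]))

harmonic = partial_sums(lambda k: 1.0 / k, 4096)
squares = partial_sums(lambda k: 1.0 / k**2, 4096)

# Harmonic sums: each step 1/(n+1) tends to 0, yet X_{2n} - X_n stays >= 1/2,
# so the sequence is not Cauchy and the total of the steps keeps growing.
gap = harmonic[4095] - harmonic[2047]

# Inverse-square sums: the total of all steps stays bounded, as in (6.25).
bounded = step_total(squares)

print(gap, step_total(harmonic), bounded)
```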


Lemma. A subset K of the normed linear space L is complete if and 
only if each fast Cauchy sequence in K converges to a point of K. 


Proof. One half of the lemma is trivial. Suppose we know that each 
fast Cauchy sequence in K converges in K. Let $\{Y_n\}$ be an arbitrary Cauchy 
sequence in K. For each positive integer k, there is a positive integer $N_k$ such 
that 

$$\|Y_m - Y_n\| < 2^{-k}, \qquad m, n \ge N_k.$$

Evidently, we can choose such integers $N_k$ so that $N_1 < N_2 < N_3 < \cdots$. 
Let 

$$X_k = Y_{N_k}.$$

Then 

$$\|X_k - X_{k+1}\| < 2^{-k},$$

so that $\{X_k\}$ is a fast Cauchy sequence in K. 
Therefore $\{X_k\}$ converges to a vector X ∈ K. Since $\{Y_n\}$ is a Cauchy 
sequence and the subsequence $\{Y_{N_k}\}$ converges, the entire sequence 
converges to the same limit X. 
Our first basic theorem deals with the existence of “fixed points” 
for certain special mappings. 


Definition. If S is a subset of a normed linear space, a contraction of S 
is a function 

$$T : S \to S$$

such that there exists c, 0 < c < 1, with 

$$\|TX - TY\| \le c\|X - Y\|, \qquad X, Y \in S. \tag{6.27}$$


Theorem 4. If K is a complete non-empty subset of the normed linear 
space L, and if T is a contraction of K, then T has one and only one fixed 
point, i.e., there exists one and only one point Y ∈ K such that T(Y) = Y. 


Proof. Let X be any point of K. We iterate the contraction T: 

$$X, \quad TX, \quad T^2X = T(TX), \quad \ldots, \quad T^{n+1}X = T(T^nX), \quad \ldots$$

The sequence $T^nX$ will converge to the fixed point Y. We have 

$$\|TX - T^2X\| = \|TX - T(TX)\| \le c\|X - TX\|$$

and so 

$$\|T^2X - T^3X\| \le c\|TX - T^2X\| \le c^2\|X - TX\|$$

$$\vdots$$

$$\|T^nX - T^{n+1}X\| \le c^n\|X - TX\|.$$

Since 0 < c < 1, we have 

$$\sum_{n=1}^{\infty} \|T^nX - T^{n+1}X\| < \infty.$$

Thus $\{T^nX\}$ is a fast Cauchy sequence, and since K is complete there exists 
Y ∈ K with 

$$Y = \lim_n T^nX.$$

Since T is a contraction, T is (uniformly) continuous. Thus 

$$TY = \lim_n T(T^nX) = \lim_n T^{n+1}X = Y.$$

So Y is a fixed point. 
Why is Y the unique fixed point? If TZ = Z then 

$$\|Y - Z\| = \|TY - TZ\| \le c\|Y - Z\|.$$

Since 0 < c < 1, it must be that $\|Y - Z\| = 0$. 


The result just established not only proves that a fixed point exists, 
but the proof says that iteration (repeated application of T) will lead from 
any point of K to an approximation of the fixed point. Such iterative 
processes occur frequently in mathematics and its applications. 
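The iteration in the proof is easy to try out. The sketch below is an illustration under stated assumptions, not part of the text's argument: it takes K = [0, 1] on the real line and T = cos, which maps [0, 1] into itself and satisfies (6.27) with c = sin 1 by the mean value theorem. The function name `iterate_to_fixed_point` is hypothetical. The stopping rule uses the estimate from the proof: summing the geometric bounds gives $\|T^nX - Y\| \le \frac{c^n}{1-c}\|X - TX\|$.

```python
import math

def iterate_to_fixed_point(T, x0, c, tol=1e-12, max_steps=10_000):
    """Iterate x -> T(x) until the bound c^n / (1 - c) * |x0 - T(x0)| is below tol.

    Returns (T^n x0, n). The constant c must be a contraction constant for T.
    """
    x = x0
    first_step = abs(x0 - T(x0))
    for n in range(max_steps):
        # At this point x equals T^n(x0); the bound controls its distance to Y.
        if (c ** n) / (1.0 - c) * first_step < tol:
            return x, n
        x = T(x)
    raise RuntimeError("tolerance not reached within max_steps")

y, steps = iterate_to_fixed_point(math.cos, 0.5, math.sin(1.0))
print(y, steps, abs(y - math.cos(y)))
```

Starting from a different point of [0, 1] changes the number of steps but not the limit, in line with the uniqueness part of Theorem 4.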


EXAMPLE 17. Suppose p is a non-negative continuous function on a 
closed interval [a, b] and we are trying to approximate $\sqrt{p}$. We can try 
Newton’s method, which is to make an educated guess, g, and try to 
improve it by taking $h = g + \frac{1}{2}(p - g^2)$ as a second approximation, 
$k = h + \frac{1}{2}(p - h^2)$ as a third, etc. This method is based, therefore, on 
the mapping 

$$T : C([a, b]) \to C([a, b])$$

defined by 

$$Tf = f + \tfrac{1}{2}(p - f^2).$$

A fixed point of T is a continuous square root of p, since Tf = f if and 
only if $p - f^2 = 0$. Under what circumstances might the contraction 
theorem be applicable? In other words, what might reasonably insure 
that 

$$\|Tf - Tg\|_\infty \le c\|f - g\|_\infty$$

where 0 < c < 1? Now 
Tf Τρ = x8" —f?) 
=1(g +f)(g —S) 


50 
|| ΤΥ -- Tell. ΞΞ 5}15 +f ||. 1} —F ll- 


Thus, if we can restrict T to a set of functions f, g, ··· whose sup norms 
are bounded by c < 1, we will have a contraction. Suppose, for example, 
that 

$$\|p\|_\infty = c^2 < 1.$$

This is not much of a restriction, since we can always multiply p by a 
constant. Now let 

$$K = \{f \in C([a, b]);\ 0 \le f(x) \le \sqrt{p(x)}\}.$$

Evidently K is closed under uniform convergence, and all the functions 
in K are bounded by c < 1. The set K is mapped by T inside itself: 

$$T : K \to K.$$

For, if $0 \le f \le \sqrt{p}$, then 

$$f + \tfrac{1}{2}(p - f^2) \ge 0$$

and 

$$f + \tfrac{1}{2}(p - f^2) = f + \tfrac{1}{2}(\sqrt{p} + f)(\sqrt{p} - f) \le f + (\sqrt{p} - f) = \sqrt{p}$$

because 

$$0 \le \tfrac{1}{2}(\sqrt{p} + f) \le 1.$$

EXAMPLE 18. With the contraction theorem, we can establish the 
existence (and uniqueness) of solutions to certain first order differential 
equations: 

$$\frac{dy}{dx} = f(x, y).$$

We could just as easily deal with Banach space valued functions; however, 
let us deal only with vector ($R^n$) valued functions on an interval. 
We suppose that we are given an interval I on the real line and a function 
F which maps (part of) $R^{n+1}$ into $R^n$. We seek a function Y 

$$Y : I \to R^n$$

which satisfies the differential equation 

$$Y'(x) = F(x, Y(x)). \tag{6.28}$$

We must nail down some constants of integration; hence, we specify an 
initial condition 

$$Y(x_0) = Y_0 \tag{6.29}$$

where $x_0$ is a given point of I and $Y_0$ is a given vector in $R^n$. If we wish to 
think of it that way, (6.28) is a system of differential equations 

$$\begin{aligned} y_1'(x) &= f_1(x, y_1(x), \ldots, y_n(x)) \\ &\ \ \vdots \\ y_n'(x) &= f_n(x, y_1(x), \ldots, y_n(x)) \end{aligned}$$

which we have subjected to the initial conditions 

$$y_j(x_0) = y_{j0}, \qquad 1 \le j \le n.$$


Now we need to be clear about our assumptions concerning the 
given function F. For simplicity we shall assume that F(x, Y) is defined 
for all x ∈ I and all vectors Y near $Y_0$: 

$$F : I \times B(Y_0; r) \to R^n.$$

We assume that F is continuous and that F satisfies a (uniform) Lipschitz 
condition in the second variable: 

$$|F(x, Y_1) - F(x, Y_2)| \le M|Y_1 - Y_2| \tag{6.30}$$

where M is some constant. These conditions are certainly satisfied if F 
is a smooth function. 

Here is the thing to observe. If the function Y satisfies (6.28) and (6.29), then 
the fundamental theorem of calculus tells us that 

$$Y(x) = Y_0 + \int_{x_0}^{x} F(t, Y(t))\,dt. \tag{6.31}$$

The expression on the right “operates” on (certain) functions Y to produce 
other functions, and what we are looking for is a fixed point for 
that operation. Where does that integral on the right operate? It operates 
on certain continuous functions from I into $R^n$. So, we consider the Banach 
space $C(I; R^n)$, which is the space of continuous functions from I into 
$R^n$ endowed with the sup norm (assume I is compact): 

$$\|G\|_\infty = \sup\{|G(x)|;\ x \in I\}.$$

The integral in (6.31) will make sense for functions Y whose values are 
such that (t, Y(t)) is in the domain of F. So, let us consider the set of functions 
which are uniformly within r of the constant function $Y_0$: 

$$K = \{Y \in C(I; R^n);\ \|Y - Y_0\|_\infty \le r\}. \tag{6.32}$$


(Remember that $Y_0$ is a function in (6.32).) Then, we have an operation 
(function) T defined on K by 

$$T : K \to C(I; R^n), \qquad (TY)(x) = Y_0 + \int_{x_0}^{x} F(t, Y(t))\,dt.$$

Notice that 

$$|(TY)(x) - Y_0| \le |x - x_0| \sup |F|.$$

Therefore, the function TY is uniformly near $Y_0$ in an interval about 
$x_0$. If δ is any number such that 

$$\delta \sup |F| \le r,$$

if J is the interval 

$$J = \{x \in I;\ |x - x_0| \le \delta\}, \tag{6.33}$$

and if we take 

$$K = \{Y \in C(J; R^n);\ \|Y - Y_0\|_\infty \le r\},$$

then 

$$T : K \to K.$$

(Here, the norm $\|Y - Y_0\|_\infty$ refers to the supremum over the interval 
J.) If δ is sufficiently small, then T is a contraction: 

$$|(TY_1)(x) - (TY_2)(x)| = \left|\int_{x_0}^{x} [F(t, Y_1(t)) - F(t, Y_2(t))]\,dt\right| \le |x - x_0|\,M\,\|Y_1 - Y_2\|_\infty$$

where M is the constant of (6.30). Specifically, if we require that δM < 1, 
then T is a contraction of K. By the fixed point theorem there is, on the 
interval J, one and only one function Y which satisfies (6.28) and (6.29). 

Clearly then, we have a unique solution to our problem on the entire 
interval I. Take the solution on J; use its values at the end points of J 
as initial data; and solve (6.28) in an interval of length δ about each end 
point. Keep doing that. The point is that the δ does not depend on $x_0$, 
and the local solutions patch together correctly by the uniqueness. 
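The operator T of this example can be iterated numerically. The following is a rough sketch under assumptions that are ours, not the text's: a scalar equation Y′ = x + Y with Y(0) = 0 (so M = 1 in (6.30)), the one-sided interval [0, δ] with δ = 0.5 (so δM < 1), and the integral replaced by the trapezoid rule on a uniform grid. The names `picard_step` and `picard` are hypothetical. The exact solution of this particular problem is $e^x - x - 1$, which the code uses only for comparison.

```python
import math

def picard_step(F, xs, ys, y0):
    """Samples of (TY)(x) = y0 + integral from xs[0] to x of F(t, Y(t)) dt (trapezoid rule)."""
    out = [y0]
    for i in range(1, len(xs)):
        h = xs[i] - xs[i - 1]
        out.append(out[-1] + 0.5 * h * (F(xs[i - 1], ys[i - 1]) + F(xs[i], ys[i])))
    return out

def picard(F, x0, y0, delta, steps=200, iterations=30):
    """Iterate T starting from the constant function Y_0 on the grid over [x0, x0 + delta]."""
    xs = [x0 + delta * i / steps for i in range(steps + 1)]
    ys = [y0] * len(xs)
    for _ in range(iterations):
        ys = picard_step(F, xs, ys, y0)
    return xs, ys

xs, ys = picard(lambda x, y: x + y, 0.0, 0.0, 0.5)
err = max(abs(y - (math.exp(x) - x - 1.0)) for x, y in zip(xs, ys))
print(err)
```

The remaining error comes from the trapezoid rule rather than from the iteration, since the contraction constant δM = 0.5 makes thirty iterations far more than enough here.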

It will clarify the proof of our second basic theorem if we reformulate 
completeness geometrically. 


Lemma. The subset K (of the normed linear space L) is complete if and 
only if it is closed and has the following property: If $\{K_n\}$ is a sequence of 
subsets of K such that 

(i) each $K_n$ is closed and non-empty, 
(ii) $K_1 \supset K_2 \supset K_3 \supset \cdots$, 
(iii) lim diam ($K_n$) = 0, 

then the intersection $\bigcap_n K_n$ is non-empty. 
Proof. Suppose that K is complete. Let $\{K_n\}$ be any sequence of 
subsets with properties (i), (ii), and (iii). Let $X_n$ be any point of $K_n$. There 
is such a point, by (i). Since by (ii) the sets $K_n$ decrease, we have 

$$X_k \in K_N, \qquad k \ge N.$$

Therefore, 

$$\|X_m - X_n\| \le \operatorname{diam}(K_N), \qquad m, n \ge N.$$

Since (iii) tells us that $\lim_N \operatorname{diam}(K_N) = 0$, we have 

$$\lim_{m,n} \|X_m - X_n\| = 0,$$

that is, $\{X_n\}$ is a Cauchy sequence. Since K is complete, there exists X ∈ K 
with 

$$\lim_n \|X - X_n\| = 0.$$

It is easy to see that X is in each of the sets $K_N$, because $K_N$ is closed 
and $X_n \in K_N$ for n ≥ N. (Note that the condition on the diameters 
guarantees that the intersection of all the sets $K_n$ contains precisely the 
one point X.) 

Conversely, suppose K is closed and has the stated property for sequences 
of closed subsets. Let $\{X_n\}$ be a Cauchy sequence in K. For each 
positive integer n, there exists a positive integer $N_n$ such that 

$$\|X_k - X_m\| \le \frac{1}{n}, \qquad k, m \ge N_n.$$

We can choose these integers so that $N_1 < N_2 < N_3 < \cdots$. Let $K_n$ be the 
intersection with K of the closed ball $B(X_{N_n}; 1/n)$: 

$$K_n = \{X \in K;\ \|X - X_{N_n}\| \le \tfrac{1}{n}\}.$$

Obviously each $K_n$ is a closed, non-empty set, and diam ($K_n$) ≤ 2/n. Since 
$K_n$ contains every $X_k$ with k ≥ $N_n$, and since $N_1 < N_2 < N_3 < \cdots$, we 
have $K_1 \supset K_2 \supset K_3 \supset \cdots$. By hypothesis, there is a point X which lies 
in every $K_n$. We have 

$$\|X - X_k\| \le \frac{2}{n}, \qquad k \ge N_n$$

and thus 

$$\lim_k X_k = X.$$


Theorem 5 (Baire Category Theorem). Let K be a complete subset of 
the normed linear space X. Let {U,} be a sequence of subsets of K such that 


(i) each U, is open relative to K; 
(ii) each U, is dense in K. 


Then the intersection 

$$\bigcap_n U_n$$

is dense in K. 


Proof. First, we prove that the intersection of all the $U_n$ is non-empty, 
as follows. Choose any point $X_1$ in $U_1$. Since $U_1$ is open relative to K, 
there exists $r_1$ so that the intersection of the ball 

$$B_1 = B(X_1; r_1)$$

with K is contained in $U_1$. Shrink $r_1$ a little, and we may arrange that 
$K \cap B_1 \subset U_1$, where $B_1$ is now the closed ball. Since $U_2$ is dense in K, 
there is a point of $U_2$ in $K \cap B_1$. 
Use the fact that $U_2$ is open to find a closed ball $B_2$: 

$$K \cap B_2 \subset U_2, \qquad B_2 \subset B_1.$$

We may arrange that $r_2 < \frac{1}{2} r_1$. By induction, we obtain a decreasing 
sequence of closed balls: 

$$B_1 \supset B_2 \supset B_3 \supset \cdots$$

$$\lim_n \operatorname{diam}(B_n) = 0$$

$$B_n \cap K \subset U_n$$

$$B_n \cap K \ne \emptyset.$$

Since K is complete, there is a point of K which lies in every $B_n$. That point 
is (accordingly) in the intersection of all the $U_n$. 

Now, it is apparent that the intersection of the sets $U_n$ is dense in 
K. Just look at the proof. Given X ∈ K and ε > 0, use the fact that $U_1$ 
is dense in K to choose the first ball $B_1$ so that $B_1 \subset B(X; \varepsilon)$. 


Corollary. A Banach space is not the union of a countable number of 
proper closed subspaces of itself. 


Proof. Let L be a Banach space. Let $S_1, S_2, \ldots$ be linear subspaces 
of L, each of which is closed and proper: $S_n = \bar{S}_n \ne L$. The assertion is 
that 

$$\bigcup_n S_n \ne L.$$

We want to apply the Baire theorem to the open sets $U_n = L - S_n$. 
Certainly $U_n$ is open, but is $U_n$ dense in L? If it were not, then the interior 
of $S_n$ would be non-empty, i.e., $S_n$ would contain some open ball B(X; ε). 
Since $S_n$ is a linear subspace, it would therefore contain the open ball of 
radius ε about the origin: 

$$B(X; \varepsilon) = X + B(0; \varepsilon)$$

and, hence, would contain all of L because B(0; ε) contains a non-zero 
scalar multiple of every vector. We conclude that $U_n$ is dense in L. By the 
Baire theorem, the intersection of all the $U_n$ is still dense in L, which means 
that the union 

$$\bigcup_n S_n$$

has empty interior. 


Some people prefer to state the Baire theorem this way: If K is complete 
and $\{K_n\}$ is a sequence of closed subsets of K such that 

$$K = \bigcup_n K_n$$

then one of the sets has a non-empty interior relative to K. This is essentially 
the form we used in the proof of the corollary. 
EXAMPLE 19. The Baire theorem makes it vividly clear that the set of 
real numbers is uncountable. The complement of any one point is a dense 
open subset of R. Thus, if we delete any countable set from R, the remain- 
ing set is dense in R. 


EXAMPLE 20. Let L be the linear space of polynomial functions on 
some interval [a, b]. There are various interesting norms on L, e.g., 

$$\|f\|_\infty, \quad \|f\|_1, \quad \text{etc.}$$

But, the corollary tells us that there is no norm relative to which L is a 
Banach space, because L is the union of an increasing sequence of 
finite-dimensional subspaces 

$$L = \bigcup_n S_n, \qquad S_n = \{f \in L;\ \text{degree}(f) \le n\}$$

and finite-dimensional subspaces are always closed. (See Theorem 3.) 


EXAMPLE 21. Here is one way to show that “practically every” 
continuous function is nowhere differentiable. Consider a closed interval I = 
[a, b] on the real line and the space C(I) of continuous (real- or complex-valued) 
functions on I. We equip C(I) with the sup norm as usual. If m, n 
are positive integers, define $U_{m,n}$ to be the set of all functions f in C(I) 
with this property: For each x ∈ I, there exists a point t ∈ I such that 

(i) 0 < |t − x| < 1/m 

(ii) |f(t) − f(x)| > n|t − x|. 

Now each $U_{m,n}$ is open in C(I). We leave the proof as an exercise. Also, 
$U_{m,n}$ is dense in C(I). To see that, just approximate each f uniformly by a 
piecewise linear function which consists of steep line segments joining 
points near to one another. By the Baire theorem 

$$S = \bigcap_{m,n} U_{m,n}$$

is dense in C(I). What does a function in S look like? If f ∈ S, then given 
any x in I and given any positive integers m, n, there exists t ∈ I with 

$$0 < |t - x| < \frac{1}{m}, \qquad \frac{|f(t) - f(x)|}{|t - x|} > n.$$

Clearly such an f is not differentiable at any point of I. We conclude 
(at least) that the set of continuous yet nowhere differentiable functions 
is dense in C(I). 


Exercises 


1. The normed linear space L is complete if and only if its closed unit ball 
B(0; 1) is complete. 


2. Consider $\ell^\infty$, the space of all sequences X = ($x_1, x_2, \ldots$) of real [complex] 
numbers, with the sup norm. Let 
K={X © (°:x,;=0° and limx,= Ἕ 1}. 
(a) Show that K is a closed subset of $\ell^\infty$. 
(b) Let T be the shift mapping 
T(X) — (0, X15 X22, X35+-- ). 
Show that T(K) c K. 
(c) Show that T is a weak contraction: 
$$\|T(X) - T(Y)\|_\infty < \|X - Y\|_\infty$$
but that T has no fixed point in K. 
3. If 0 < c < 1 and I is the closed interval [0, c], then 

$$(Tf)(x) = 1 + \int_0^x f(t)\,dt$$

defines a contraction on C(I) whose fixed point will satisfy the differential 
equation f′ = f with the initial condition f(0) = 1. What approximations to the 
solution do you obtain from the sequence T(0), $T^2(0)$, $T^3(0)$, ... ? 


4. What contraction operator on C([0, c]) would you use to obtain a solution 
of f″ + f = 0, f(0) = a, f′(0) = b? 


5. Let S be a square in the plane. Remove from S any sequence of line segments. 
The set which remains is dense in S. 


6. Prove that the set of rational numbers is not the intersection of any countable 
family of open subsets of the real line. 


7. Prove that each set $U_{m,n}$ in Example 21 is open and dense in C(I). 


8. True or false? In solving the differential equation Y′ = F(x, Y) (Example 
18), if F is a function of class $C^\infty$ in each variable, then the solution is of class 
$C^\infty$. How about the case where F is real-analytic? 


9. Let K be a compact set in a normed linear space, and let $\{X_n\}$ be a sequence 
of distinct points in K (a countable subset). Prove that there exists on K a 
real-valued continuous function f which takes on different values at the various points 
$X_k$: 

$$f(X_k) \ne f(X_n), \qquad k \ne n.$$
6.5. Compactness 


In a finite-dimensional normed linear space, every closed and bound- 
ed set is compact. In the general normed linear space there is no such 
simple characterization of compact sets. Of course, we might not expect 
one in an incomplete space since, evidently, any Cauchy sequence which 
fails to converge provides an example of a bounded sequence with no 
convergent subsequence. But, even in a Banach space L, the unit ball 

$$\{X \in L;\ \|X\| \le 1\}$$

is closed and bounded, yet is never compact unless L is finite-dimensional. 
Before we prove this, let us consider some concrete examples, illustrating 
how the non-compactness of the unit ball comes about in a Banach space 


in which there are infinitely many independent directions in which to 
move. 


EXAMPLE 22. In any of the sequence spaces $\ell^p$, 1 ≤ p < ∞, the 
vectors 

$$X_1 = (1, 0, 0, \ldots), \quad X_2 = (0, 1, 0, \ldots), \quad X_3 = (0, 0, 1, \ldots), \quad \ldots$$

constitute a sequence in the ball: $\|X_n\|_p = 1$ for every n. It has no 
convergent subsequence because 

$$\|X_m - X_n\|_p = 2^{1/p}, \qquad m \ne n.$$

In other words, the sequence wanders about, with no two of the vectors 
in the sequence ever coming near to one another. 

This same phenomenon can be reproduced easily in a Banach space 
such as C([0, 1]), with the sup norm. For instance, if $g_n = (1/\sqrt{n}) f_n$, where 
$f_n$ is the tent function in Figure 25, we have $\|g_n\|_\infty = 1$ for each n and 
$\|g_m - g_n\|_\infty \ge 1$ whenever m ≠ n. 
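The separation of the coordinate vectors is easy to confirm by direct computation. The sketch below works with finite truncations of the sequences, which is an assumption made only so that the vectors can be stored; the helper names `unit_vector`, `p_norm`, and `sup_norm` are hypothetical. For finite p the distance between two distinct coordinate vectors is $2^{1/p}$, and in the sup norm it is 1.

```python
def unit_vector(n, length):
    """Truncated coordinate vector: 1.0 in position n (0-based), 0.0 elsewhere."""
    return [1.0 if i == n else 0.0 for i in range(length)]

def p_norm(x, p):
    """The l^p norm of a finite list, for finite p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def sup_norm(x):
    """The sup norm of a finite list."""
    return max(abs(t) for t in x)

length = 8
for p in (1, 2, 4):
    # Distances between all pairs of distinct coordinate vectors, in the p-norm.
    dists = {
        round(p_norm([a - b for a, b in zip(unit_vector(m, length),
                                            unit_vector(n, length))], p), 12)
        for m in range(length) for n in range(length) if m != n
    }
    print(p, dists, 2 ** (1.0 / p))
```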


We are now going to show that “separated” sequences of the type 
exhibited in Example 22 exist in the unit ball of any normed linear space 
L which is not finite-dimensional. We shall show that, given r, 0 < r < 1, 
there exists a sequence of vectors $X_n$ ∈ L such that $\|X_n\| \le 1$ and 
$\|X_m - X_n\| \ge r$ whenever m ≠ n. We shall obtain the vectors inductively, 
roughly as follows. Having chosen $X_1, \ldots, X_n$, choose $X_{n+1}$ such that, 
among points in the unit ball of L, it is as far away as possible from the 
subspace spanned by $X_1, \ldots, X_n$. 
We need some elementary facts about distances. If M is a non-empty 
subset of a normed linear space L and X is a vector in L, the distance from 
X to M is (of course) 

$$d(X; M) = \inf_{Z \in M} \|X - Z\|.$$


Lemma. Let M be a closed subspace of L and let X ∈ L. 
(i) d(X; M) = 0 if and only if X ∈ M. 
(ii) If Y ∈ M, then d(X + Y; M) = d(X; M). 
(iii) d(cX; M) = |c| d(X; M) for all scalars c. 

Proof. (i) is a restatement of the fact that M is a closed set. (ii) Let 
Y ∈ M. Then 

$$d(X + Y; M) = \inf_{Z \in M} \|X + Y - Z\|.$$

Since M is a subspace and Y ∈ M, as Z ranges over M the vectors Y − Z 
do also: 

$$\{Y - Z;\ Z \in M\} = M.$$

This proves (ii). Conclusion (iii) is trivial if c = 0; it merely says 0 ∈ M. 
If c ≠ 0, 

$$d(cX; M) = \inf_{Z \in M} \|cX - Z\| = |c| \inf_{Z \in M} \left\|X - \frac{1}{c}Z\right\|.$$

As Z ranges over M, so does (1/c)Z, that is, (1/c)M = M. Therefore, 
d(cX; M) = |c| d(X; M). 


Here is the key fact for our present purposes. 


Lemma. If M is a proper closed subspace of a normed linear space L, 
then 

$$\sup_{\|X\| \le 1} d(X; M) = 1.$$


Proof. If $\|X\| \le 1$, then $d(X; M) \le \|X\|$ because M contains the 
zero vector. Therefore, 

$$0 \le \sup_{\|X\| \le 1} d(X; M) \le 1.$$

Let 0 < r < 1. We shall show that there is a vector X in the unit ball 
of L for which d(X; M) = r. That will prove the lemma. 
Since M is proper and closed, there is a vector $X_0$ in L such that 

$$d(X_0; M) = \delta > 0.$$

By part (iii) of the lemma above 

$$d\!\left(\frac{r}{\delta} X_0; M\right) = r.$$

Since r < 1, there is a vector Y ∈ M such that 

$$\left\|\frac{r}{\delta} X_0 - Y\right\| < 1.$$

Let 

$$X = \frac{r}{\delta} X_0 - Y.$$

Then $\|X\| < 1$ and, by part (ii) of the lemma above, d(X; M) = r. 


Theorem 6. Let L be a normed linear space which is not finite-dimensional. 
If 0 < r < 1, there is a sequence of vectors $\{X_n\}$ in L such that 

(i) $\|X_n\| \le 1$ for each n; 
(ii) $\|X_m - X_n\| \ge r$ whenever m ≠ n. 


Proof. Let $X_1$ be any (non-zero) vector in the unit ball of L. Let $M_1$ 
be the subspace spanned by $X_1$. Since L is not finite-dimensional, $M_1$ 
is a proper subspace of L. Theorem 3 tells us that $M_1$ is closed. By the 
last lemma, there is a vector $X_2$ ∈ L such that $\|X_2\| \le 1$ and $d(X_2; M_1) \ge r$. 
Let $M_2$ be the subspace spanned by $X_1$ and $X_2$. Then $M_2$ is a proper 
closed subspace of L, so there is a vector $X_3$ ∈ L with $\|X_3\| \le 1$ and 
$d(X_3; M_2) \ge r$. We continue inductively: Having chosen $X_1, \ldots, X_n$, 
choose $X_{n+1}$ in the unit ball such that the distance from $X_{n+1}$ to the 
subspace spanned by $X_1, \ldots, X_n$ is at least r. 


Corollary. The normed linear space L is finite-dimensional if and only 
if its closed unit ball $\{X \in L;\ \|X\| \le 1\}$ is compact. 


Proof. If L is finite-dimensional then L may be identified with $R^k$ 
[or $C^k$] under some norm. The equivalence of all norms on $R^k$ [$C^k$] tells 
us that the closed unit ball for that norm is closed and bounded. By the 
Heine-Borel theorem it is compact. 

If the unit ball for L is compact, each sequence in it must have an 
accumulation point. Thus no separated sequence can exist in the unit 
ball, as in Theorem 6. Conclusion: L must be finite-dimensional. 


The non-compactness of the unit ball tells us that, in a normed linear 
space which is not finite-dimensional, the interior of every compact set 
is empty, i.e., every compact set is “thin”. Nevertheless, compact sets in 
such normed linear spaces are sometimes important, and we wish to say 
something about them. 


Theorem 7. Let K be a subset of the normed linear space L. The following 
are equivalent: 
(i) K is compact. 
(ii) K is sequentially compact, i.e., every sequence in K has a 
subsequence which converges to a point of K. 
(iii) K is complete and, for every ε > 0, K can be covered by a finite 
number of balls of radius ε. 


Proof. Assume (i). Let $\{X_n\}$ be a sequence in K. If $\{X_n\}$ has no 
accumulation point in K, then each X in K has a neighborhood $U_X$ which 
contains $X_n$ for only a finite number of n’s. Since K is compact, a finite 
number of the sets $U_X$ covers K; but then K would contain $X_n$ for only a 
finite number of n’s, which is absurd. Hence $\{X_n\}$ has an accumulation 
point in K, and some subsequence converges to that point. 

Assume (ii). Certainly every Cauchy sequence in K converges to a 
point of K, because the sequential compactness guarantees that a 
subsequence converges. (If a Cauchy sequence has any convergent 
subsequence, then the entire sequence converges to the limit of the subsequence.) 
Let ε > 0, and let us show that K can be covered by a finite number of 
ε-balls. Assume the contrary. Select any point $X_1$ in K. The ball $B(X_1; \varepsilon)$ 
does not cover K; hence, there is a point $X_2$ in K with $\|X_1 - X_2\| \ge \varepsilon$. 
But $B(X_1; \varepsilon)$ and $B(X_2; \varepsilon)$ together do not cover K, so there is a point 
$X_3$ in K with $\|X_1 - X_3\| \ge \varepsilon$ and $\|X_2 - X_3\| \ge \varepsilon$. By induction, we 
obtain a sequence of points $X_n$ in K such that (for each n) 

$$\|X_k - X_n\| \ge \varepsilon, \qquad k = 1, \ldots, n - 1.$$

Such a sequence (obviously) cannot have any convergent subsequence. 
Thus (ii) implies (iii). 

We reverse our merry-go-round for a moment to show that (ii) 
follows from (iii). Assume (iii), and let $\{X_n\}$ be any sequence in K. Cover K 
by a finite number of (closed) balls of radius 1. One of those balls must 
contain $X_n$ for infinitely many values of n. Choose one such closed ball 
$K_1$. Cover $K_1$ by a finite number of closed balls of radius 1/2. Intersect each 
closed ball with $K_1$. One such intersection must contain $X_n$ for infinitely 
many values of n. Select one such intersection $K_2$. Cover $K_2$ by balls of 
radius 1/3, and repeat. By induction, we obtain a nested sequence of closed 
sets: 

(i) $K_1 \supset K_2 \supset K_3 \supset \cdots$ 
(ii) lim diam ($K_n$) = 0 
(iii) $K_n$ contains $X_k$ for infinitely many values of k. 

For each k, choose $X_{n_k}$ in $K_k$, with $n_1 < n_2 < \cdots$ (which (iii) permits). 
By (ii), $\{X_{n_k}\}$ is a Cauchy sequence. Since 
K is complete that (sub)sequence converges to a point of K. 

Now we are ready to show that (i) follows from (iii). Assume (iii). 
Then (as we just proved) K is sequentially compact. That means that 
each countable open cover of K has a finite subcover. Therefore, in order 
to prove that K is compact, we need only show that every open cover of 
K has a countable subcover. We do that in (almost) the same way as we 
did in proving the Heine-Borel theorem, Theorem 12 of Chapter 2. We 
find a replacement for the balls with rational centers and rational radii 
which we used in that proof. 
First, we must find a countable subset of K which is dense in K. We 
can do that as follows. For each n, choose a finite number of points in 
K such that the balls of radius 1/n about those points cover K. Put all 
those points together, and we have a countable subset of K. It is dense in 
K, because (for each n) every point of K is within distance 1/n of one of 
the points. 

Second, enumerate the points in that countable set: $X_1, X_2, \ldots$. 
For each k, consider all open balls about $X_k$ with rational radius. The 
collection of all those balls (as k varies) is countable. Enumerate them 
$B_1, B_2, B_3, \ldots$. The important property of this sequence of open balls is 
this: Every open set which intersects K contains one of the sets $B_n$. 

Now, let $\{U_\alpha\}$ be any open cover of K. Let N be the set of those 
positive integers n such that $B_n$ is contained in one of the sets $U_\alpha$. For each 
n ∈ N choose an index $\alpha_n$ so that $B_n$ is contained in $U_{\alpha_n}$. Then 

$$\{U_{\alpha_n};\ n \in N\}$$

is a countable (sub)cover of K. 
We have established 

$$(i) \Longrightarrow (ii) \Longleftrightarrow (iii) \Longrightarrow (i).$$
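The selection of points pairwise at least ε apart, used in the proof that (ii) implies (iii), can be carried out mechanically on a finite set. The sketch below is an illustration under assumptions of ours: K is replaced by a finite grid in the unit square, and the distance is the sup norm on pairs. The names `greedy_net` and `sup_dist` are hypothetical. The selection stops exactly when the ε-balls about the chosen centers cover every point.

```python
def sup_dist(a, b):
    """Sup-norm distance between two points given as tuples."""
    return max(abs(s - t) for s, t in zip(a, b))

def greedy_net(points, eps, dist):
    """Choose centers pairwise at least eps apart; every point ends up within eps of a center."""
    centers = []
    for x in points:
        # x becomes a new center only if no existing eps-ball already contains it.
        if all(dist(x, c) >= eps for c in centers):
            centers.append(x)
    return centers

points = [(i / 20, j / 20) for i in range(21) for j in range(21)]
centers = greedy_net(points, 0.25, sup_dist)
print(len(points), len(centers))
```

On a set which is not totally bounded the same selection would never terminate, which is the separated sequence produced in the proof.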
Corollary. Let S be a subset of a Banach space X. The following are 
equivalent. 


(i) The closure of S is compact. 
(ii) Each sequence in S has a subsequence which converges (in X). 
(iii) For each ε > 0, S can be covered by a finite number of balls of 
radius ε. 


We are now going to prove Ascoli’s theorem, which characterizes 
the compact sets in one particular infinite-dimensional normed linear 
space—the space of continuous functions on a compact set. This is a very 
useful theorem, because it guarantees that certain sequences of continuous 
functions have uniformly convergent subsequences. 

Let K be a compact set (in some normed linear space). Let C(K) be 
the space of continuous complex-valued functions on K, endowed with 
the sup norm 


Il fll. = sup {| Χ(Δ}ὲ; X © Ky}. 


Let S be a subset of C(K), 1.e., a family of continuous functions on K. 
Let’s see what it means for S to be compact in the Banach space C(K). 
Suppose that (the closure of) S is compact. 

Let $\epsilon > 0$. We can cover S by a finite number of $\epsilon$-balls. In other 
words, there exist functions $f_1, \ldots, f_n$ in S such that every $f \in S$ is in 
one of the open balls: 


(6.33)    $\{f ; \|f - f_k\|_\infty < \epsilon\}, \quad k = 1, \ldots, n.$


(There are finitely many functions in S such that every f in S is uniformly 
within $\epsilon$ of one of those functions.) As some wise man once observed, 
this means that the functions in S are “equicontinuous,” that is, if X is 
near Y, then f(X) is near f(Y), uniformly for all f in S. To be specific, 
given $\epsilon$, choose $f_1, \ldots, f_n$ as in (6.33). Each $f_k$ is uniformly continuous, 
hence 

$$|f_k(X) - f_k(Y)| < \epsilon, \quad X, Y \in K, \ |X - Y| < \delta_k.$$

Let $\delta$ be the least of the numbers $\delta_k$, so that 

(6.34)    $|f_k(X) - f_k(Y)| < \epsilon, \quad X, Y \in K, \ |X - Y| < \delta.$

Let f be any function in S. There exists k, $1 \le k \le n$, such that f is 
uniformly within $\epsilon$ of $f_k$. Therefore, 

$$|f(X) - f(Y)| \le |f(X) - f_k(X)| + |f_k(X) - f_k(Y)| + |f_k(Y) - f(Y)|$$

$$|f(X) - f(Y)| < 3\epsilon, \quad X, Y \in K, \ |X - Y| < \delta.$$


Definition. Let S be a family of functions on the set K. If $X \in K$, the 
family S is equicontinuous at X if, for each $\epsilon > 0$, there exists $\delta > 0$ such 
that 

$$|f(X) - f(Y)| < \epsilon, \quad Y \in K, \ |X - Y| < \delta, \ f \in S.$$

The family S is (uniformly) equicontinuous if, for each $\epsilon > 0$, there exists 
$\delta > 0$ such that 


$$|f(X) - f(Y)| < \epsilon, \quad X, Y \in K, \ |X - Y| < \delta, \ f \in S.$$


We have just shown that each compact subset of C(K) is (uniformly) 
equicontinuous. Ascoli’s theorem is essentially the converse to that. 
Before we prove Ascoli’s theorem, let us look at some examples, in order 
to be certain that we understand the meaning of equicontinuity. (The 
reader may prefer to go on to the proof of the theorem and then return 
to the examples.) 


EXAMPLE 23. Let K = [a, b], a closed interval on the real line. The 
simplest (interesting) equicontinuous family of functions on [a, b] is 
obtained from estimates on derivatives. For smooth functions f 


$$f(x) - f(y) = \int_y^x f'(t)\, dt;$$


hence 

$$|f(x) - f(y)| = \left| \int_y^x f'(t)\, dt \right| \le |x - y| \sup |f'|.$$


Therefore, consider a family such as the collection of all functions of class 
$C^1$ for which the derivatives are bounded by 3. For the functions in that 
family, we have a uniform estimate on the distance from f(x) to f(y): 


$$|f(x) - f(y)| \le 3|x - y|.$$

So “given $\epsilon$, we can choose one $\delta$ which will serve simultaneously for all 
f in that family.” That is the meaning of equicontinuity. Obviously we 
can extend the example to the following case. Let M > 0 and let S be the 
family of all functions on [a, b] which are differentiable and have 
derivatives bounded by M. Then S is a (uniformly) equicontinuous family 
of functions on [a, b]. 
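
The estimate in Example 23 can be checked numerically. The sketch below is only an illustration under assumptions of our own: the family $f_c(x) = 3\sin(x + c)$, the interval [0, 2], and the helper names `make_f`, `delta_for`, and `worst_gap` are hypothetical choices, not part of the text. It uses the single choice $\delta = \epsilon/M$ for every function in the family.

```python
import math

M = 3.0  # assumed common bound on |f'| for the whole family


def make_f(c):
    """Hypothetical family member f_c(x) = 3*sin(x + c); |f_c'| <= 3 everywhere."""
    return lambda x: 3.0 * math.sin(x + c)


def delta_for(eps, bound=M):
    """A single delta that works for every function with |f'| <= bound."""
    return eps / bound


def worst_gap(functions, a, b, delta, steps=400):
    """Largest sampled |f(x) - f(y)| with |x - y| < delta, over all given functions."""
    worst = 0.0
    h = (b - a) / steps
    for f in functions:
        for i in range(steps + 1):
            x = a + i * h
            y = min(b, x + 0.999 * delta)  # stays strictly inside the delta-window
            worst = max(worst, abs(f(x) - f(y)))
    return worst


eps = 0.1
family = [make_f(0.37 * k) for k in range(50)]
gap = worst_gap(family, 0.0, 2.0, delta_for(eps))
print(gap, gap < eps)
```

The same $\delta$ is used for all fifty functions; that uniformity, rather than the continuity of any one function, is what the word "equicontinuous" records.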


EXAMPLE 24. In a variety of problems (e.g., solving differential or 
integral equations) one meets integral operators (operations) of the 
following type. Suppose we are given (say) a continuous function K on the 
rectangle $[a, b] \times [c, d]$. Then we can define a transformation 


$$T : C([c, d]) \to C([a, b])$$


by 

$$(Tf)(x) = \int_c^d f(t) K(x, t)\, dt.$$

Now 

$$(Tf)(x) - (Tf)(y) = \int_c^d f(t) [K(x, t) - K(y, t)]\, dt.$$

Therefore, 

$$|(Tf)(x) - (Tf)(y)| \le M(x, y) \int_c^d |f(t)|\, dt$$

where 


$$M(x, y) = \sup \{|K(x, t) - K(y, t)| ; c \le t \le d\}.$$

Since K is continuous, 

$$\lim_{|x - y| \to 0} M(x, y) = 0.$$

Therefore, it should be clear that T transforms $\{f ; \|f\|_1 \le 1\}$ into an 
equicontinuous family, i.e., 


$$S = \left\{ Tf ; \int_c^d |f| \le 1 \right\}$$

is equicontinuous. 
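
As a rough numerical companion to Example 24, the sketch below approximates $(Tf)(x)$ by a midpoint rule and checks the estimate $|(Tf)(x) - (Tf)(y)| \le M(x, y) \int |f|$. Everything specific here is an assumption made for illustration: the kernel $K(x, t) = e^{-xt}$, the interval [0, 1], the test functions $\cos(kt)$, and the helper names.

```python
import math


def nodes(c, d, n):
    """Midpoints of n equal subintervals of [c, d], together with the step h."""
    h = (d - c) / n
    return [c + (j + 0.5) * h for j in range(n)], h


def apply_T(f, kernel, x, c, d, n=500):
    """Midpoint-rule approximation of (Tf)(x) = integral over [c, d] of f(t) K(x, t) dt."""
    ts, h = nodes(c, d, n)
    return sum(f(t) * kernel(x, t) for t in ts) * h


def modulus(kernel, x, y, c, d, n=500):
    """Sampled version of M(x, y) = sup over t of |K(x, t) - K(y, t)|."""
    ts, _ = nodes(c, d, n)
    return max(abs(kernel(x, t) - kernel(y, t)) for t in ts)


def l1_norm(f, c, d, n=500):
    """Midpoint-rule approximation of the integral of |f| over [c, d]."""
    ts, h = nodes(c, d, n)
    return sum(abs(f(t)) for t in ts) * h


kernel = lambda x, t: math.exp(-x * t)                      # assumed continuous kernel
fs = [lambda t, k=k: math.cos(k * t) for k in range(1, 6)]  # each has integral of |f| <= 1
x, y = 0.40, 0.45
M_xy = modulus(kernel, x, y, 0.0, 1.0)
ok = all(
    abs(apply_T(f, kernel, x, 0.0, 1.0) - apply_T(f, kernel, y, 0.0, 1.0))
    <= M_xy * l1_norm(f, 0.0, 1.0) + 1e-12
    for f in fs
)
print(M_xy, ok)
```

Because all three quantities are computed on the same nodes, the discrete inequality holds for the same reason as the continuous one; the bound depends on f only through its integral norm, which is why one $\delta$ serves the whole family.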


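EXAMPLE 25. Let D be the open unit disk in the complex plane. Let f 
be a complex-analytic function on D: 


$$f(z) = \sum_{n=0}^{\infty} a_n z^n.$$


Suppose for the moment that f is analytic on the closed disk $\bar{D}$, i.e., that 
the power series for f converges in a disk of radius $1 + \epsilon$. As we have seen, 
we then have Cauchy’s formula (5.38): 


$$f(\alpha) = \frac{1}{2\pi i} \int_{|z|=1} \frac{f(z)}{z - \alpha}\, dz = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f(e^{i\theta})\, e^{i\theta}}{e^{i\theta} - \alpha}\, d\theta.$$


Thus, 


$$|f(\alpha) - f(\beta)| = \left| \frac{1}{2\pi} \int_{-\pi}^{\pi} f(e^{i\theta})\, e^{i\theta} \left[ \frac{1}{e^{i\theta} - \alpha} - \frac{1}{e^{i\theta} - \beta} \right] d\theta \right|$$

$$= \frac{1}{2\pi} \left| \int_{-\pi}^{\pi} \frac{f(e^{i\theta})\, e^{i\theta} (\alpha - \beta)}{(e^{i\theta} - \alpha)(e^{i\theta} - \beta)}\, d\theta \right|$$

$$\le |\alpha - \beta| \, \|f\|_\infty \, \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{d\theta}{|e^{i\theta} - \alpha| \, |e^{i\theta} - \beta|}.$$


Suppose we restrict $\alpha$, $\beta$ to a closed subdisk 

$$D_r = \{\alpha ; |\alpha| \le r\}.$$


Then 

$$|e^{i\theta} - \alpha| \ge 1 - |\alpha| \ge 1 - r$$

so that 

(6.35)    $|f(\alpha) - f(\beta)| \le |\alpha - \beta| \, \dfrac{\|f\|_\infty}{(1 - r)^2}, \quad \alpha, \beta \in D_r.$


Obviously then, the same inequality holds if f is merely analytic on D. 

From (6.35) we see immediately the following. If M > 0, the family 
of complex-analytic functions on D which are bounded by M is (uniformly) 
equicontinuous on each compact subdisk $D_r$. This equicontinuity is 
another of the special important properties of complex-analytic functions. 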
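
The inequality (6.35) can be spot-checked on random points of a subdisk. The sketch below is a sanity check rather than a proof; the two sample functions ($z^5$ and $e^z/e$, both bounded by 1 on D), the radius r = 0.9, and the helper names `lipschitz_bound` and `check` are assumptions chosen for illustration.

```python
import cmath
import random


def lipschitz_bound(M, r):
    """Constant on the right of (6.35): |f(a) - f(b)| <= |a - b| * M / (1 - r)**2."""
    return M / (1.0 - r) ** 2


def check(f, M, r, trials=1000, seed=0):
    """Test (6.35) on random pairs of points in the closed subdisk of radius r."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = cmath.rect(r * rng.random(), 2 * cmath.pi * rng.random())
        b = cmath.rect(r * rng.random(), 2 * cmath.pi * rng.random())
        if abs(f(a) - f(b)) > abs(a - b) * lipschitz_bound(M, r) + 1e-12:
            return False
    return True


results = [
    check(lambda z: z ** 5, 1.0, 0.9),                  # |z^5| <= 1 on the unit disk
    check(lambda z: cmath.exp(z) / cmath.e, 1.0, 0.9),  # |e^z / e| <= 1 on the unit disk
]
print(results)
```

The bound is far from sharp for these particular functions; what matters for equicontinuity is only that it depends on f through the single number M.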


Theorem 8 (Ascoli). Let K be a compact set (in a normed linear space). 
Let S be a subset of C(K). Then S is compact if and only if S is closed, bounded, 
and (uniformly) equicontinuous. 


Proof. We proved the “only if” part of the theorem in motivating the 
definition of equicontinuity. What remains to be proved is that a closed, 
bounded, equicontinuous family is compact in C(K). It will suffice to 
establish sequential compactness. Let $\{f_n\}$ be a (uniformly) equicontinuous 
sequence of complex-valued functions on K which is bounded: 


$$\|f_n\|_\infty \le M, \quad n = 1, 2, 3, \ldots.$$

We must extract from $\{f_n\}$ a subsequence which converges uniformly. 


Here is the basic fact: If $\{f_n\}$ is a bounded equicontinuous sequence and 
if $\epsilon > 0$, there exists a subsequence such that all the functions in the 
subsequence are uniformly within $\epsilon$ of one another (i.e., all the functions in 
the subsequence lie in one $\epsilon$-ball in C(K)). Suppose we have established 
this basic fact. The proof of the existence of a uniformly convergent 
subsequence is then quite easy: Choose a ball of radius 1 in C(K) which 
contains $f_n$ for infinitely many values of n. Apply the “basic fact” to the $f_n$’s 
which are inside that ball, and obtain a ball of radius 1/2 which contains 
$f_n$ for infinitely many n’s, etc. By induction we obtain a nested sequence 
$B_1 \supset B_2 \supset \cdots$ in C(K) such that (i) diam $(B_k) \to 0$, and (ii) each $B_k$ 
contains $f_n$ for infinitely many values of n. Choose $f_{n_k}$ in $B_k$ and we have 
the subsequence we seek. 

So, all we need to do is prove the basic fact. Given $\epsilon$, determine $\delta$ 
by the equicontinuity: 


(6.36)    $|f_n(X) - f_n(Y)| < \epsilon/3, \quad X, Y \in K, \ \|X - Y\| < \delta.$


Choose points $X_1, \ldots, X_N$ in K so that the open balls $B(X_k; \delta)$ cover 
K. In order to ensure that 


$$\|f_n - f_m\|_\infty < \epsilon$$


it will suffice to have 


(6.37)    $|f_n(X_k) - f_m(X_k)| < \epsilon/3, \quad k = 1, \ldots, N.$


That follows from (6.36) and the choice of $X_1, \ldots, X_N$: if $\|X - X_k\| < \delta$, then 


$$|f_n(X) - f_m(X)| \le |f_n(X) - f_n(X_k)| + |f_n(X_k) - f_m(X_k)| + |f_m(X_k) - f_m(X)|.$$


But it is trivial to find a subsequence such that, for any $f_n$ and $f_m$ in the 
subsequence, the values are within $\epsilon/3$ at each of the finitely many points 
$X_1, \ldots, X_N$. That can be done for any bounded sequence of functions on 
any set, because $\{(f_n(X_1), \ldots, f_n(X_N))\}$ is a bounded subset of $C^N$. 
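
The last step of the proof is a pigeonhole argument in finitely many coordinates, and a finite version of it can be sketched in code. The sketch below works with real-valued samples and a finite list of functions, so it only illustrates the idea: the functions $f_n(x) = \sin(x + n)$, the three sample points, and the helper name `close_subfamily` are assumptions of ours, and in the proof itself the fullest cell contains infinitely many terms.

```python
from collections import defaultdict


import math


def close_subfamily(vectors, cell):
    """Indices of the vectors lying in the fullest grid cell of side `cell`.

    Two vectors in the same cell differ by at most `cell` in every coordinate.
    """
    buckets = defaultdict(list)
    for idx, v in enumerate(vectors):
        key = tuple(int(c // cell) for c in v)  # grid cell containing v
        buckets[key].append(idx)
    return max(buckets.values(), key=len)


sample_points = [0.0, 0.5, 1.0]  # stand-ins for X_1, ..., X_N
vectors = [tuple(math.sin(x + n) for x in sample_points) for n in range(2000)]

eps = 0.3
cell = eps / 3
chosen = close_subfamily(vectors, cell)
spread = max(
    abs(vectors[i][k] - vectors[j][k])
    for i in chosen
    for j in chosen
    for k in range(len(sample_points))
)
print(len(chosen), spread <= cell)
```

A bounded set in finitely many dimensions meets only finitely many cells, so some cell must receive infinitely many terms of an infinite sequence; equicontinuity is what transfers closeness at the sample points to closeness everywhere on K.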


Corollary. If $\{f_n\}$ is a sequence of complex-valued continuous functions 
on a compact set K and if the sequence is bounded and (uniformly) 
equicontinuous, then there exists a subsequence $\{f_{n_k}\}$ which converges uniformly on 
K. 


Corollary. Let D be an open set in the plane and let $\{f_n\}$ be a sequence of 
complex-analytic functions on D. If the sequence is bounded, 


$$|f_n| \le M, \quad n = 1, 2, 3, \ldots$$


then it has a subsequence which converges, uniformly on every compact 
subset of D. 


Proof. As we showed in Example 25, a bounded sequence of 
complex-analytic functions is uniformly equicontinuous on each closed disk inside 
D. It should then be clear from Ascoli’s theorem that, if K is any compact 
subset of D, there is a subsequence which converges uniformly on K. 
We want to find one subsequence which does this, simultaneously for 
all compact sets K in D. 

Write D as the union of an increasing sequence of compact sets. For 
instance, let K, be the set of points z such that 


(i) $z \in D$; 
(ii) $|z| \le n$; 
(iii) the distance from z to the boundary of D is not less than 1/n. 


Then $K_n$ is compact, 

$$K_1 \subset K_2 \subset K_3 \subset \cdots, \qquad D = \bigcup_n K_n.$$


Suppose we are given a sequence of analytic functions $f_n$, and the sequence 
is bounded on each compact subset of D. Pick subsequences 
$S_1 \supset S_2 \supset S_3 \supset \cdots$ so that all the functions in the subsequence $S_n$ are uniformly 
within $2^{-n}$ of one another on the set $K_n$. Then, any subsequence 

$$\{f_{n_k}\}, \qquad f_{n_k} \in S_k,$$


converges, uniformly on each compact subset of D. 


Exercises 


1. True or false? If $\{f_n\}$ is a sequence of functions of class $C^1$ on the closed 
interval [a, b], then $\{f_n\}$ is equicontinuous if and only if the sequence of 
derivatives $\{f_n'\}$ is bounded in sup norm. 


2. True or false? If {F,} is a sequence of functions of bounded variation on 
[a, b] and if each F, has total variation at most 1, then the sequence of functions 


fix) = |" dF) 
is equicontinuous. 


3. If you did not know Ascoli’s theorem, would you regard it as obvious that 
any sequence of smooth functions on [0, 1] with derivatives bounded by 6 has a 
subsequence which converges uniformly on [0, 1]? 


4. Let 

$$T : C([0, 1]) \to C([0, 1])$$

be the map defined by 

$$(Tf)(x) = \int_0^1 f(t) K(x, t)\, dt$$


where K is a continuous function on the square $[0, 1] \times [0, 1]$. If B is a bounded 
subset of C([0, 1]), then the set T(B) has compact closure in C([0, 1]). 


5. Let $\ell^1$ be the space of (absolutely) summable sequences: 

$$X = (x_1, x_2, x_3, \ldots), \qquad \|X\|_1 = \sum_n |x_n| < \infty.$$


In the Banach space $\ell^1$, consider the subset 


SiS {X3 |x] <a 2, εν 
n 
Is S compact? 


6. Let f be a continuous complex-valued function on the closed interval [0, 1] 
such that f(0) = 0 and $\|f\|_\infty \le 1$. Show that the sequence of powers 
$f, f^2, f^3, \ldots$ is equicontinuous if and only if $\|f\|_\infty < 1$. 


7. Give an example of a continuous function f on [0, 1] such that f(0) = 0, 
$\|f\|_\infty = 1$ and the sequence of its powers $\{f^n\}$ has compact closure in the normed 
linear space consisting of C([0, 1]), together with the integral norm $\|\cdot\|_1$. 


8. If S and T are compact subsets of the normed linear space L, then the 
algebraic sum S + T is compact. (Hint: Addition is continuous.) 


9. Let $C_b(R)$ be the space of bounded continuous complex-valued functions on 
the real line, equipped with the sup norm 


$$\|f\|_\infty = \sup |f|.$$


If $f \in C_b(R)$ and $t \in R$, the t-translate of f is the function 

$$f_t(x) = f(x + t).$$


The function f is periodic if $f_t = f$ for some $t \ne 0$. Show that, if f is periodic, 
then the set of all translates of f is a compact subset of $C_b(R)$. (Hint: Look at the 
map which sends each t into $f_t$.) 


10. Refer to Exercise 9. The function $f \in C_b(R)$ is called almost periodic if the set 
of all translates of f has compact closure in $C_b(R)$. Prove the following: 


(a) The sum of two almost periodic functions is almost periodic; hence, 
every exponential polynomial 


$$P(x) = \sum_{k=1}^{n} c_k e^{i t_k x}, \quad t_k \in R, \ c_k \in C$$


is almost periodic. 
(b) The product of two almost periodic functions is almost periodic. 
(c) The space of almost periodic functions is a closed subspace of $C_b(R)$. 


11. The function $f \in C_b(R)$ is almost periodic if and only if, for each $\epsilon > 0$, 
there exists $\delta > 0$ such that every interval of length $\delta$ contains a number t for 
which 


$$\|f - f_t\|_\infty < \epsilon.$$


(Hint: Cover the set of translates by €-balls.) 


12. Let n be a positive integer and let $a_0, \ldots, a_n$ be continuous real-[complex-] 
valued functions on a closed interval I on the real line. Suppose that $a_n$ has no 
zeros on I. Then any function f which satisfies the differential equation 


$$a_0 f + a_1 f' + \cdots + a_n f^{(n)} = 0$$

on I will be of class $C^n$. Give $C^n(I)$ the norm 


$$\|f\| = \|f\|_\infty + \|f'\|_\infty + \cdots + \|f^{(n)}\|_\infty.$$

(a) If M is the set of solutions of the differential equation, then M is a closed 
subspace of $C^n(I)$. 
(b) Use Ascoli’s theorem and the fact that $a_n$ has no zeros to show that the 
closed unit ball of M is compact and, hence, that M is finite-dimensional. 


6.6. Quotient Spaces 


Suppose that L is a linear space and M is a (linear) subspace of L. We 
shall associate with the pair (L, M) a linear space L/M, known as the 
quotient (or difference) space of L modulo M. Then we shall use quotient 
spaces to discuss the completion of a normed linear space. 

Loosely speaking, L/M will be the linear space obtained by doing 
linear operations in L under the agreement to regard two vectors as 
“equal” whenever they differ by a vector in the subspace M. 

If X and Y are vectors in L, we say that X is congruent to Y modulo 
M and write 


(6.38)    $X \equiv Y, \mod M$


if the difference (X − Y) is in M. Congruence modulo M has the three 
properties of equality which one uses repeatedly: 


(a) For each X in L, $X \equiv X$. 
(b) If $X \equiv Y$, then $Y \equiv X$. 
(c) If $X \equiv Y$ and $Y \equiv Z$, then $X \equiv Z$. 


These result from the definition, together with the facts that (a) the zero 
vector is in M, (b) the negative of any vector in M is again in M, (c) the 
sum of two vectors in M is again in M. 

Congruences may be added or multiplied by scalars: 


(d) If $X_1 \equiv Y_1$ and $X_2 \equiv Y_2$, then $X_1 + X_2 \equiv Y_1 + Y_2$. 
(e) If $X \equiv Y$, then $cX \equiv cY$. 


These follow immediately from the fact that M is closed under vector 
addition and under scalar multiplication. It follows from (d) and (e) 
that we may legitimately form linear combinations of congruences: If 
$X_i \equiv Y_i$, $i = 1, \ldots, n$, and if $c_1, \ldots, c_n$ are scalars, then 

$$c_1 X_1 + \cdots + c_n X_n \equiv c_1 Y_1 + \cdots + c_n Y_n.$$

Properties (a)-(e) constitute the formal justification for carrying out 
linear operations in L, while disregarding vectors in M (vectors which 
are congruent to 0). Let us consider a simple case in which we might be 
interested in doing just this sort of thing. 


EXAMPLE 26. Suppose we are interested in problems or questions 
which involve integration of piecewise-continuous functions over an 
interval [a, b]. From the point of view of integration, certain functions in 
the linear space L = PC([a, b]) may be disregarded, namely, the 
functions which have the value 0 except at a finite number of points. These 
are precisely the piecewise-continuous functions f such that 


$$\int_a^b |f| = 0.$$


If we let M be the subspace consisting of these “null” functions for 
integration, then 

$$f \equiv g, \mod M$$


means that f(x) = g(x) except for a finite number of values of x and, in 
particular, implies that 

$$\int_a^b f = \int_a^b g.$$


We might put it this way. If we chose to regard functions which are 
congruent modulo M as being the same, the integral would never know 
the difference. 

There is another way of looking at linear operations modulo the 
subspace M, and this leads to the concept of the quotient space L/M. 
Suppose we ask whether calculations modulo M correspond to bona fide 
linear operations and relations in some vector space. If so, then a given 
vector X and all vectors congruent to X must be names for (must 
represent) the same vector in the new space. The vector which they represent 
will be the set 


(6.39)    $Q(X) = \{Y \in L ; X \equiv Y, \mod M\}.$


Note that Q(X) is a very special type of subset of L. It is the X-translate 
of the subspace M: 


Q(X) =X+M 
={X+Z;Z 6 M}. 


In Figure 26 this is pictured for two choices of X in the case where L is 
$R^2$ and M is a straight line through the origin. 

FIGURE 26 

Note that the following conditions are equivalent: 

(i) Q(X) = Q(Y). 

(ii) X and Y lie in the same translate of M. 

(iii) X + M = Y + M. 

(iv) Y is in the X-translate of M. 

(v) $Y \in Q(X)$. 

Now we let L/M be the collection of all (distinct) translates of M. 
The vector addition and scalar multiplication will be defined on L/M in 
such a way that 

(6.40)    $Q(X_1) + Q(X_2) = Q(X_1 + X_2), \qquad cQ(X) = Q(cX).$


In other words, the operations in L/M will be defined in such a way that 
the map 


(6.41)    $L \xrightarrow{Q} L/M$


is a linear transformation. 


Theorem 9. The set L/M, consisting of all translates of the subspace M, 
can be given the structure of a linear space in such a way that the map 


Q(X) = X+M 
is a linear transformation. This structure on L/M is unique. 


Proof. Let $T_1$ and $T_2$ be translates of M. If Q is to be linear, the rule 
for adding $T_1$ and $T_2$ must be this: Choose any vectors $X_j \in T_j$, that is, 
any vectors $X_j$ such that $Q(X_j) = T_j$, and let 


(6.42)    $T_1 + T_2 = Q(X_1 + X_2).$


How do we know this is a well-defined rule for adding translates? We 
must verify that the translate $Q(X_1 + X_2)$ does not depend upon the choice 
of $X_1$ and $X_2$, i.e., that it is the same for all $X_1 \in T_1$, $X_2 \in T_2$. But, this 
is simply a restatement of property (d) of congruence modulo M: sums of 
congruent vectors are congruent. 

If T is a translate of M and Q is to be linear, the rule which defines 
cT must be this: Choose any vector $X \in T$, that is, any X such that Q(X) 
= T, and let 

$$cT = Q(cX).$$


Once again it must be verified that Q(cX) is the same set for all vectors 
X in T. This is a restatement of property (e) of congruence modulo M: 
scalar multiplication preserves congruences. 

We have shown that there is one and only one way to define a vector 
addition and scalar multiplication on L/M so that (6.40) is valid. In order 
to show that L/M with these operations is a linear space, we must verify 
properties (1)-(9) of Section 6.1, the defining properties of a linear space: 
Addition is associative; addition is commutative; scalar multiplication 
distributes over addition, etc. This is extremely easy to do, because we 
have the map Q satisfying (6.40). Each property follows from the 
corresponding property in the space L. Of course, the zero vector in L/M is 
Q(0) = M. 


The linear space L/M is called the quotient space of L modulo M, 
and Q is called the associated quotient map. 

If L is a normed linear space and M is a closed subspace of L, then 
there is a “natural” norm on the quotient space L/M. The norm of a 
translate of M is defined as its distance from the origin in L: 


(6.43)    $|||T||| = \inf_{Y \in T} \|Y\|.$

Of course, this may be rewritten 

$$|||X + M||| = \inf_{Z \in M} \|X + Z\|$$


which is equivalent to saying, “The norm of Q(X) is the distance from X 
to M.” 
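
A small concrete case may help fix the definition (6.43). The sketch below takes L to be $R^2$ with the usual length and M a line through the origin, so that the quotient norm of X + M is the perpendicular distance from X to M. The particular vectors and the helper name `quotient_norm` are assumptions made for illustration only.

```python
import math


def quotient_norm(x, m):
    """Norm of the translate x + M in R^2 / M, where M is the line spanned by m != 0.

    This is the distance from x to M: the minimum of |x + z| over z in M.
    """
    mm = m[0] * m[0] + m[1] * m[1]
    t = (x[0] * m[0] + x[1] * m[1]) / mm        # coefficient of the projection of x onto M
    rx, ry = x[0] - t * m[0], x[1] - t * m[1]   # component of x orthogonal to M
    return math.hypot(rx, ry)


m = (1.0, 2.0)                      # spans the subspace M
x = (3.0, 1.0)
y = (x[0] + 2.5 * m[0], x[1] + 2.5 * m[1])   # y is congruent to x modulo M
u = (-1.0, 4.0)

same_translate = math.isclose(quotient_norm(x, m), quotient_norm(y, m))
triangle = quotient_norm((x[0] + u[0], x[1] + u[1]), m) <= (
    quotient_norm(x, m) + quotient_norm(u, m) + 1e-12
)
print(quotient_norm(x, m), same_translate, triangle)
```

The value depends only on the translate, not on the representative chosen from it, which is exactly what is needed for (6.43) to define a function on L/M.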


Lemma. Let L be a linear space, M a subspace of L, and let $\|\cdot\|$ be 
a semi-norm on L. The function $|||\cdot|||$ defined by (6.43) is a semi-norm 
on L/M. It is a norm provided M is closed (under sequential convergence). 


Proof. We verify the three properties of a semi-norm. 


(i) For each translate T we have $0 \le |||T||| < \infty$, because 
$\{\|Y\| ; Y \in T\}$ is a non-empty set of non-negative numbers. 

(ii) To verify the triangle inequality, let $T_1$ and $T_2$ be translates of 
M. Let $X_1 \in T_1$ and $X_2 \in T_2$. Then $(X_1 + X_2) \in (T_1 + T_2)$. Therefore, 


$$|||T_1 + T_2||| \le \|X_1 + X_2\| \le \|X_1\| + \|X_2\|.$$

Since this is true for all $X_1 \in T_1$ and all $X_2 \in T_2$, we have 


$$|||T_1 + T_2||| \le |||T_1||| + |||T_2|||.$$

(iii) If T is a translate of M and c is a scalar 

$$|||cT||| = \inf_{X \in T} \|cX\| = \inf_{X \in T} |c| \, \|X\| = |c| \inf_{X \in T} \|X\| = |c| \, |||T|||.$$


To verify the assertion as to when $|||\cdot|||$ is a norm, we consider a 
translate T such that $|||T||| = 0$. By definition, this means that there is a 
sequence $\{X_n\}$ in T such that 


$$\lim \|X_n\| = 0.$$


Fix a vector X in T, so that T = X + M. Then each $X_n$ has the form 


$$X_n = X + Z_n, \quad Z_n \in M.$$

Thus 

$$\lim \|X + Z_n\| = 0$$

or 

$$\lim (-Z_n) = X.$$


If M is closed, this implies that $X \in M$. But then T = M, the zero vector 
in L/M. 


The [semi-] norm $|||\cdot|||$ is called the quotient [semi-] norm on L/M. 
If, in a particular context, L is understood to be endowed with a particular 
norm, then usually it is assumed that L/M is equipped with the quotient 
norm. It may be worth remarking that the quotient map 

$$L \xrightarrow{Q} L/M$$

is then a continuous linear transformation from L onto L/M. The 
(uniform) continuity of Q is immediate from the fact that Q is linear and 
norm-decreasing, $|||Q(Z)||| \le \|Z\|$: 


$$|||Q(X) - Q(Y)||| = |||Q(X - Y)||| \le \|X - Y\|.$$


The most important case of the last lemma is the one in which L 
is a normed linear space and M is a closed linear subspace. In a finite- 
dimensional normed linear space, every subspace is closed. In general 
this is not the case, e.g., M may be dense in L as the polynomials are 
dense in the space of continuous functions on an interval. In such cases, 
the quotient space L/M is of relatively little interest. We do not have time 
here to go into the why’s and wherefore’s of this. Suffice it to say that, 
in a normed linear space, it is the closed linear subspaces which are of 
prime importance. 
There is one other (special) case of the last lemma in which we are 
interested. 


EXAMPLE 27. Suppose that $\|\cdot\|$ is a semi-norm on L and M is the 
subset of null vectors: 


$$M = \{X \in L ; \|X\| = 0\}.$$

It is easy to see that M is a linear subspace of L and that M is closed under 
sequential convergence. Therefore, the quotient semi-norm on L/M is 
actually a norm. It has a very special property in this case: 


$$|||Q(X)||| = \|X\|.$$


What the quotient structure on L/M does is to provide a natural way of 
“dividing out” the null vectors—forgetting about them, one might say. 

We dealt with a special case of this situation in Example 26, where 
L was the space of piecewise-continuous functions, endowed with the 
integral semi-norm 


$$\|f\| = \int_a^b |f|.$$


Dividing out by the functions of semi-norm zero amounts to creating 
a new linear space by identifying functions which differ only at a finite 
number of points. 

The vectors in the quotient space are not easy to describe, except to 
say they are certain classes of functions. Therefore, one usually handles 
this special type of quotient space as we did in Example 26: Carry out 
linear operations in L, using congruence modulo null vectors in place 
of equality. 

One technical remark may be in order about the proof of the last 
lemma. We did not discuss convergence, open sets, closed sets, complete- 
ness, etc. in the context of a semi-normed linear space. One can do this 
easily and naturally, because things are formally the same as in a normed 
space. But there can be one confusing point: A sequence may have more 
than one limit. For 

$$\lim X_n = X, \qquad \lim X_n = Y$$


only tells us that $\|X - Y\| = 0$ and, consequently, that Y = X + Z, 
where Z is a null vector. Therefore, if a sequence converges, the set of 
its limit points is a translate of the subspace of null vectors. In particular, 
every closed linear subspace contains the space of null vectors. This is why, 
when we divide out by a closed subspace, the quotient semi-norm au- 
tomatically becomes a norm. 


Theorem 10. If L is a complete semi-normed linear space and M is a 
closed linear subspace, then L/M is a Banach space. 


Proof. We want to show that L/M is complete. It suffices to show that 
every fast Cauchy sequence in L/M converges. Let $\{T_n\}$ be such a sequence: 


(6.44)    $\sum_n |||T_n - T_{n+1}||| < \infty.$


Let Q be the quotient map 

$$L \xrightarrow{Q} L/M.$$


By the definition of the quotient norm, for each positive integer n there is 
a vector $Z_n$ in L such that 


(6.45)    $Q(Z_n) = T_{n+1} - T_n, \qquad \|Z_n\| \le |||T_n - T_{n+1}||| + 2^{-n}.$

Select any vector $X_1$ such that $Q(X_1) = T_1$, and then define 

$$X_n = X_1 + (Z_1 + \cdots + Z_{n-1}), \quad n = 2, 3, 4, \ldots.$$

The definition of $X_n$ is so constructed that 

$$X_{n+1} - X_n = Z_n, \quad n = 1, 2, 3, \ldots.$$

By (6.44) and (6.45) we have 


$$\sum_n \|X_{n+1} - X_n\| \le \sum_n |||T_n - T_{n+1}||| + \sum_n 2^{-n} < \infty.$$

Since L is complete, the fast Cauchy sequence $\{X_n\}$ converges, say 

$$\lim X_n = X.$$


Since Q is continuous, 


$$\lim Q(X_n) = Q(X).$$


But, since $Q(X_1) = T_1$ and $Q(X_{n+1} - X_n) = Q(Z_n) = T_{n+1} - T_n$, we 
have $Q(X_n) = T_n$ for each n. Thus $\{T_n\}$ converges in L/M to the vector 
T = Q(X). 


Exercises 


1. Endow $R^3$ with the standard norm (length). Let M be any linear subspace of 
$R^3$, e.g., a line or plane through the origin. Let $M^\perp$ be the set of all vectors which 
are orthogonal (perpendicular) to M: 


$$M^\perp = \{X ; \langle X, Z \rangle = 0, \text{ all } Z \in M\}.$$


(a) Show that $M^\perp$ is a subspace of $R^3$. 
(b) Let $X \in R^3$. Show that the infimum 


$$\inf_{Z \in M} \|X - Z\|$$


is attained for precisely one vector $Z \in M$ and (for this vector) X − Z is in $M^\perp$. 
(c) Each vector $X \in R^3$ is uniquely expressible in the form X = Y + Z, 
where $Y \in M^\perp$ and $Z \in M$. 
(d) Identify the quotient space $R^3/M$ with $M^\perp$, by showing that each 
translate of M contains precisely one vector Y in $M^\perp$ and that the (quotient) 
norm of the translate is the length of Y. 


2. Let L be the space of continuous real- [complex-] valued functions on the 
closed interval [−1, 1], endowed with the sup norm. Let M be the set of even 
functions in L: 

$$f(-x) = f(x), \quad -1 \le x \le 1.$$

(a) Show that M is a closed subspace of L. 
(b) Show that each translate of M contains precisely one odd function: 

$$g(-x) = -g(x), \quad -1 \le x \le 1.$$

(c) Identify L/M with the space of odd continuous functions. 
(d) Is there a simple relationship between the quotient norm on L/M and 
the sup norm on odd functions? 

3. Let M be a subspace of the linear space L. Let T be a linear transformation 
from L into some linear space N: 

$$T : L \to N, \qquad T(cX + Y) = cT(X) + T(Y).$$

Suppose that T(Z) = 0 for every $Z \in M$. Show that there is a linear 
transformation S, from L/M into N, such that T is the composition of S and the quotient 
map Q: 


$$L \xrightarrow{Q} L/M \xrightarrow{S} N, \qquad T = S \circ Q.$$


4. Let M be a closed subspace of the normed linear space L. Show that U is an 
open subset of L/M if and only if $Q^{-1}(U)$ is an open subset of L. 


5. Let T be a linear transformation from a normed linear space L to a normed 
linear space N. Prove that the following are equivalent 
(i) T is continuous. 
(ii) There exists a constant M > 0 such that 

$$\|T(X)\| \le M \|X\|, \quad \text{all } X \in L.$$


6. Let L be a normed linear space and let f be a linear functional on L, i.e., a 
linear transformation from L into the scalar field. Show that f is continuous if 
and only if the null space {X; f(X) = 0} is a closed subspace of L. 


7. Let M be a closed subspace of the normed linear space L. Prove that the 
image of the open unit ball B(0, 1) under the quotient map 


$$L \xrightarrow{Q} L/M$$


is precisely the open unit ball about the origin in L/M. 


8. Let L be a linear space. There are natural ways to define sum and scalar 
multiple for arbitrary subsets of L: 


$$A + B = \{X + Y ; X \in A, Y \in B\}$$
$$cA = \{cX ; X \in A\}.$$

(a) Look at properties (1)-(8) defining a linear space in Section 6.1. Show 
that the collection of all subsets of L, together with these operations, satisfies all 
these conditions except for the existence of a zero vector and the existence of 
negatives. 

(b) Suppose M is a subspace of L. Show that L/M, the set of translates of 
M, is a vector space under the operations above (and, indeed, that these opera- 
tions coincide with the linear operations we introduced for L/M). 

(c) Now prove that (b) describes the only way that a family of subsets can 
be a linear space under the defined operations on sets. In other words, show that, 
if a family of subsets of L forms a linear space under these operations, it is the 
family of translates of some subspace of L. 


It is frequently useful to know that a normed linear space L can be 
completed to a Banach space, i.e., that we can embed L densely in a 
Banach space. If L is not complete, it means that L has certain “holes” 
where the limits of some Cauchy sequences are missing, and we want to 
know that it is possible to fill in all the holes. 

If $S = \{X_n\}$ is a Cauchy sequence in L which does not converge, we 
want to adjoin to L a limit point for the sequence. We shall arrive at the 
completion of L by adjoining a limit point for each such sequence. But, 
we must be careful. If we have two Cauchy sequences $\{X_n\}$, $\{Y_n\}$ and if 
$\{X_n - Y_n\}$ converges to 0, then obviously we should not adjoin to L 
different limit points for the two sequences. 

Given any normed linear space L, we form Cauchy(L), the space 
of all Cauchy sequences in L. If 


$$S = \{X_n\}, \qquad T = \{Y_n\}$$

are two Cauchy sequences in L, their sum is 

$$S + T = \{X_n + Y_n\}$$


and that is again a Cauchy sequence. Scalar multiplication in Cauchy(L) 
is defined by 

$$cS = \{cX_n\}.$$

All that we are saying is this. Consider the space of all sequences of 
vectors in L, i.e., the space of all functions from $Z_+$ into L. Then Cauchy(L) 
is a linear subspace of that space of sequences. 

We introduce a semi-norm on Cauchy(L): 


(6.46)    $\|S\| = \lim_n \|X_n\|.$


That makes sense because, if $\{X_n\}$ is a Cauchy sequence, then $\{\|X_n\|\}$ is 
a Cauchy sequence in R. We said “semi-norm” because we can have 
$\|S\| = 0$ without S = 0. In fact, $\|S\| = 0$ if and only if the sequence 
$S = \{X_n\}$ converges to 0. 
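
The idea of identifying Cauchy sequences which differ by a null sequence is the same one used to build the real numbers from the rationals, and that familiar case makes a convenient illustration. The rationals are not a normed linear space over R, so the sketch below is an analogy only; the Newton iteration, the starting points 1 and 3, and the helper name `newton_sqrt2` are assumptions chosen for the example.

```python
from fractions import Fraction


def newton_sqrt2(start, n):
    """First n Newton iterates x -> (x + 2/x)/2 from a rational start.

    Every term is rational, the sequence is Cauchy, and it has no rational limit.
    """
    xs, x = [], Fraction(start)
    for _ in range(n):
        xs.append(x)
        x = (x + 2 / x) / 2
    return xs


S = newton_sqrt2(1, 8)
T = newton_sqrt2(3, 8)

# S - T tends to 0, so S and T should be assigned the same limit point.
diff = [abs(s - t) for s, t in zip(S, T)]

# The analogue of the semi-norm (6.46) is lim |x_n|; its square approaches 2.
print(float(diff[-1]), float(S[-1] ** 2))
```

Neither sequence converges in the rationals, but their difference is a null sequence, so in the quotient by the null sequences they represent one and the same new point.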

Now we remark that Cauchy(L) is complete, that is, every Cauchy 
sequence in Cauchy(L) converges in the space. Let $\{S_k\}$ be a Cauchy 
sequence in Cauchy(L): 


$$S_k = (X_{k1}, X_{k2}, X_{k3}, \ldots).$$


For each k, there is a positive integer $N_k$ such that 

(6.47)    $\|X_{km} - X_{kn}\| < \frac{1}{k}, \quad m, n > N_k.$


Choose any one of the points $X_{kn}$ with $n > N_k$ and call that point $Y_k$. 
Then 


(6.48)    $\|Y_k - X_{kn}\| < \frac{1}{k}, \quad n > N_k.$


Let $T_k$ be the constant sequence 

$$T_k = (Y_k, Y_k, Y_k, \ldots).$$


This is certainly a Cauchy sequence. Furthermore, according to (6.48), 

(6.49)    $\|S_k - T_k\| \le \frac{1}{k}.$


Since $\{S_k\}$ is a Cauchy sequence, (6.49) makes it apparent that $\{T_k\}$ is a 
Cauchy sequence, i.e., $\{Y_k\}$ is a Cauchy sequence in L. But $\{T_k\}$ obviously 
converges in the space Cauchy(L); it converges to the Cauchy sequence 


$$S = (Y_1, Y_2, Y_3, \ldots).$$


So $S_k$ also converges to S. 


Theorem 11. Each normed linear space is a dense subspace of a Banach 
space. 


Proof. We have the complete space Cauchy(L), and it is quite clear 
how to embed L in Cauchy(L). We identify X with the constant sequence 
(X, X, X, ...). The only problem is that we have only a semi-norm on 
Cauchy(L). We take care of that by dividing out the null sequences, as 
in Example 27. 

Let Null(L) be the space of Cauchy sequences of semi-norm zero, 
that is, the space of null sequences in L (ones which converge to zero). 
By Theorem 10, we know that the quotient space associated with the 
subspace is a Banach space. Furthermore, L is embedded in that Banach 
space by 

(6.50)    $L \xrightarrow{P} \text{Cauchy}(L) \xrightarrow{Q} \text{Cauchy}(L)/\text{Null}(L)$

where P(X) is the constant sequence P(X) = (X, X, X, ...) and Q is 
the quotient map. The composition $Q \circ P$ is norm-preserving because 
both P and Q are. In particular $Q \circ P$ is 1 : 1. It also preserves linear 
operations because both P and Q are linear transformations. We have left to 
the exercises the proof that the image of L is dense in the quotient Banach 
space. 


Exercises 


1. Verify that the natural injection of L into Cauchy(L) carries L onto a dense 
subspace. 


2. Use the result of Exercise 5 of Section 6.6 to prove the following. If T is a 
continuous linear transformation from the normed linear space L into the 
Banach space M, then T can be extended uniquely to a continuous linear 
transformation from $\bar{L}$ to M. ($\bar{L}$ denotes the completion of L.) 


3. If L is an inner product space, show that its completion to a Banach space is 
also an inner product space. 


7. The Lebesgue Integral 


7.1. Motivation 


We are going to discuss a type of integral which is both a more general 
and a more powerful tool than the Riemann integral we dealt with in 
Chapter 4. This integral, developed by Henri Lebesgue shortly after the 
beginning of this century, is more general in the sense reflected in these 
two statements. 


(i) Every function f which is Riemann-integrable is Lebesgue- 
integrable and the two processes assign the same integral to f. 

(ii) The class of Lebesgue-integrable functions contains many func- 
tions, e.g., unbounded functions or functions on unbounded domains, 
which are not properly Riemann-integrable and which were treated 
traditionally by “improper integrals”. 


This increased range of application of the integration process would 
not by itself justify the effort we will invest in developing the Lebesgue 
integral, because it is not too difficult to embellish the Riemann process 
to be able to deal with such mathematical statements as: 


$$\int_0^1 \frac{1}{\sqrt{x}}\, dx = 2, \qquad \int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}.$$

The significance and power of the more general method of integration 
derive from the fact that the class of Lebesgue-integrable functions is 
“complete”, in several senses. This refers, in the first instance, to the 
concept of completeness discussed in Chapter 6. 

(iii) The linear space of Lebesgue-integrable functions is complete 
relative to the (semi-) norm 


$$\|f\| = \int |f|$$


that is, it is a (semi-) Banach space. 


The completeness can be expressed in other forms, but appreciation 
of their interest or significance requires adoption of a particular point of 
view. Instead of regarding integration as a process for calculating the inte- 
grals of specific functions, we regard integration as an operator (function) 
on the space (set) of all integrable functions, 


$$I(f) = \int f.$$

This operation is linear

$$I(cf + g) = cI(f) + I(g)$$

and positive

$$I(f) \ge 0 \quad \text{if } f \ge 0.$$

Lebesgue’s brilliant idea was to focus on another critical property of
integration, which we may call weak continuity: If $f_1 \ge f_2 \ge f_3 \ge \cdots$ is
a monotone-decreasing sequence of (non-negative) integrable functions
and if it converges pointwise to 0, then

$$\lim_n I(f_n) = 0.$$


The second form of “completeness” states, roughly speaking, that 
Lebesgue extended the integral to the largest class of functions to which 
it could reasonably be extended. 


(iv) The space of Lebesgue-integrable functions is the largest sub- 
space of the space of complex-valued functions to which the integral can 
be extended so that it remains linear, positive, and weakly continuous. 

Strictly speaking, Lebesgue did not deal with the completeness in the 
form described in (iv). He phrased the completeness in terms of the concept 
of measure (length, area, volume, etc.). The language will be simpler if we 
deal only with area for the time being. Similar remarks are applicable to 
(k-dimensional) measure in $R^k$.

Area is related to the integration of functions on $R^2$ in this way. If
E is a subset of $R^2$ and if $k_E$ is the characteristic function of E,

$$k_E(X) = \begin{cases} 1, & X \in E \\ 0, & X \notin E, \end{cases} \tag{7.1}$$

then the area of E should be the integral of its characteristic function

$$A(E) = \int k_E.$$

If $k_E$ is integrable (in the sense of Lebesgue), this “definition” of area makes
sense. Lebesgue focused on the properties of the area function which are
analogous to the cited properties of the integral operator. Area is additive,

$$A(D \cup E) = A(D) + A(E) \quad \text{if } D \cap E = \emptyset;$$

it is positive,

$$A(E) \ge 0;$$

and it is weakly continuous:

$$\text{if } E_1 \supset E_2 \supset E_3 \supset \cdots \text{ and } \bigcap_n E_n = \emptyset, \text{ then } \lim_n A(E_n) = 0.$$


There are two ways to reformulate the weak continuity of area. The
first reformulation was the heart of the ancient Greeks’ “method of exhaus-
tion”: If $S_1 \subset S_2 \subset S_3 \subset \cdots$ is an increasing sequence of sets which
“exhaust” the set S,

$$S = \bigcup_n S_n,$$

then $A(S) = \lim_n A(S_n)$. The relationship to weak continuity is provided
by the fact that $E_n = S - S_n$ is a decreasing sequence with empty inter-
section if and only if $S_n$ is an increasing sequence with union S. Further-
more, if area is additive, we have $A(S_n) + A(S - S_n) = A(S)$, so that
$A(E_n)$ tends to 0 if and only if $A(S_n)$ tends to $A(S)$. The second reformula-
tion notes that if S is the union of the increasing sequence $\{S_n\}$ and we
define $T_1 = S_1$, $T_n = S_n - S_{n-1}$, $n \ge 2$, we have

$$S = \bigcup_n T_n, \qquad T_i \cap T_j = \emptyset, \quad i \ne j. \tag{7.2}$$

The additivity of area shows that $A(S_n) = A(T_1) + \cdots + A(T_n)$; hence
$A(S_n)$ tends to $A(S)$ if and only if

$$A(S) = \sum_n A(T_n). \tag{7.3}$$


This reformulation of weak continuity, which says that (7.3) follows from 
(7.2), is called the countable additivity of area. It is, perhaps, the most 
convenient form to work with because it includes the additivity property 
as a special case. 
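
As a small numerical illustration of how (7.3) follows from (7.2), the sketch below assumes, purely for the sake of example, that S is the open unit disk in the plane, exhausted by the concentric disks S_n of radius 1 - 2^(-n). The function names (`radius`, `area_S`, `area_T`, `partial_sum`) are hypothetical and do not come from the text.

```python
import math

def radius(n):
    """Radius of the n-th disk S_n in the assumed exhausting sequence."""
    return 1.0 - 2.0 ** (-n)

def area_S(n):
    """Area of the disk S_n."""
    return math.pi * radius(n) ** 2

def area_T(n):
    """Area of T_1 = S_1, or of the ring T_n = S_n - S_{n-1} when n >= 2."""
    return area_S(1) if n == 1 else area_S(n) - area_S(n - 1)

def partial_sum(N):
    """A(T_1) + ... + A(T_N); by additivity this should equal A(S_N)."""
    return sum(area_T(n) for n in range(1, N + 1))

if __name__ == "__main__":
    # The partial sums track A(S_N) and approach A(S) = pi.
    for N in (1, 5, 10, 30):
        print(N, partial_sum(N), area_S(N), math.pi)
```

The rings T_n are pairwise disjoint, so the partial sums of their areas telescope to A(S_N), which increases to the area of the disk.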

Of course, we have not stated to which class of sets the area function 
is being applied. We tend to take the concept of area more for granted 
than we do “integral”, as if every subset of the plane could be assigned an 
(numerical measure of) area in a meaningful way. But there are some very 
wild subsets of the plane. Using the axiom of choice (see Appendix), sets
can be identified to which no appropriate measure of area can be assigned;
that is, there exist sets which are “non-measurable”. These sets are so 
horrible that one might reasonably take the attitude that area should be 
discussed only for the simplest sets. But even the ancient Greeks seemed 
to be intrigued by the idea of discovering the answer to this question:
Starting from the simple sets, to what class of sets can the definition of
area be extended? All of us can be impressed by what Lebesgue was able
to prove, by developing his integral.

(v) The largest class of bounded subsets of the plane to which area 
can be extended so that it remains positive and countably additive is the 
class of bounded sets which have Lebesgue-integrable characteristic 
functions. Furthermore, for every such set, the area is the integral of 
the characteristic function. 


We wish to stress the fact that the “completeness” statements (iv) 
and (v) are phrased rather loosely. They convey the correct general ideas, 
but a number of technicalities must be cleared up before they can be trans- 
formed into precise mathematical assertions. It is not necessary, nor would 
it be advisable, for us to stop to do this now. 

If the completeness properties, (iii)-(v), describe a special property
of Lebesgue integration, they do not convey adequately what its signifi-
cance is. We shall try to give some indication of this.

The development of a complete theory of integration placed powerful 
new tools in the hands of mathematical analysts. Within a few years, 
significant advances were made in the study of complex-analytic functions. 
Complete theories became possible in the study of Fourier series and 
Fourier transforms. Lebesgue’s methods lent themselves to further gen- 
eralization, in which other “measures” were used in place of length, area, 
volume, etc. The formulation of modern probability theory would not 
have been possible without these developments. All of this has had its 
impact not only on the structure of mathematics as a discipline but also 
on the range of its applications. Let us close this introductory section with 
a few tangible examples of the tools and results which this more general 
integral will provide. 

One of the recurring questions about integration is when “the integral
of the limit (function) is the limit of the integrals.” In other words, if $f_1$,
$f_2$, ... are integrable functions on, say, a box and if

$$f(X) = \lim_n f_n(X),$$

when can we conclude that f is integrable and

$$\int f = \lim_n \int f_n?$$

If each $f_n$ is continuous and the convergence is uniform, there is no prob-
lem; but this is far from an adequate result for many purposes. For example,
in any situation where each $f_n$ is continuous but f is not, the conver-
gence cannot be uniform. To contrast the results of this type which hold
for Riemann vs. Lebesgue integration, let us list the following properties.


(a) Each $f_n$ is integrable.
(b) The sequence $\{f_n\}$ is bounded.
(c) $\{f_n\}$ converges pointwise to f.
(d) f is integrable.
(e) $\int f = \lim_n \int f_n$.

The strongest theorem for Riemann integration on a box in $R^k$ is:

(a) + (b) + (c) + (d) $\Rightarrow$ (e).

For Lebesgue integration, (d) can be moved out of the hypothesis and
into the conclusion:

(a) + (b) + (c) $\Rightarrow$ (d) + (e).


Thus, for the Lebesgue integral on a box, bounded pointwise convergence 
allows us to interchange the order of the limits. This is a very powerful 
technical tool. 
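
To make bounded pointwise convergence concrete, here is a minimal sketch under an assumed example that is not taken from the text: f_n(x) = x^n on the box [0, 1]. The sequence is bounded by 1 and converges pointwise to the function which is 0 on [0, 1) and 1 at x = 1. The convergence is not uniform, yet the integrals 1/(n + 1) still tend to the integral of the limit, which is 0. All function names are hypothetical.

```python
def f(n, x):
    """The n-th function of the assumed example, f_n(x) = x**n on [0, 1]."""
    return x ** n

def midpoint_integral(n, cells=20000):
    """Midpoint Riemann sum for the integral of f_n over [0, 1]."""
    h = 1.0 / cells
    return h * sum(f(n, (i + 0.5) * h) for i in range(cells))

def witness_of_non_uniformity(n):
    """Value of f_n at x = 1 - 1/n; it stays near 1/e, so sup|f_n - f| does not tend to 0."""
    return f(n, 1.0 - 1.0 / n)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, midpoint_integral(n), 1.0 / (n + 1), witness_of_non_uniformity(n))
```

The printed integrals shrink toward 0 while the witness values stay bounded away from 0, which is exactly the situation where uniform convergence is unavailable but the interchange of limit and integral still holds.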


(vi) If $\{f_n\}$ is a bounded sequence of Lebesgue-integrable functions
on a box in $R^k$ which converges pointwise to the function f, then f is
Lebesgue-integrable and

$$\int f = \lim_n \int f_n.$$

Parallel to the illustration just given is an equally powerful tool for 
dealing with unbounded functions: 


(vii) If $f_1 \le f_2 \le f_3 \le \cdots$ is an increasing sequence of (real-valued)
Lebesgue-integrable functions and the sequence of integrals $\int f_n$ is
bounded above, then it follows that the limit

$$f(X) = \lim_n f_n(X)$$

is finite at “almost every” point X, that f is a Lebesgue-integrable function,
and that

$$\int f = \lim_n \int f_n.$$

(We will define “almost every” in Section 7.3.)
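
Statement (vii) can be illustrated with the unbounded function 1/sqrt(x) on (0, 1]. The sketch below assumes the truncations f_n(x) = min(n, 1/sqrt(x)), an example chosen here and not taken from the text. These are bounded and increasing, their integrals 2 - 1/n are bounded above by 2, and the limit function is finite except at the single point x = 0. The function names are hypothetical.

```python
import math

def f(n, x):
    """Truncation f_n(x) = min(n, 1/sqrt(x)) for 0 < x <= 1 (assumed example)."""
    return min(float(n), 1.0 / math.sqrt(x))

def exact_integral(n):
    """Integral of f_n over (0, 1]: n * (1/n**2) + (2 - 2/n) = 2 - 1/n."""
    return 2.0 - 1.0 / n

def midpoint_integral(n, cells=50000):
    """Midpoint Riemann sum of f_n over [0, 1], as a numerical cross-check."""
    h = 1.0 / cells
    return h * sum(f(n, (i + 0.5) * h) for i in range(cells))

if __name__ == "__main__":
    for n in (1, 2, 5, 10, 100):
        print(n, exact_integral(n), midpoint_integral(n))
```

The integrals increase to 2, the value traditionally assigned to the improper integral of 1/sqrt(x) over [0, 1].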


Tools such as (vi) and (vii) are very closely interwoven with the
completeness properties, (iii)-(v). They enable one to give complete
theories of the following type.

In Chapter 5 we showed that if f is a complex-valued piecewise-
continuous function on the interval $[-\pi, \pi]$ and if

$$\sum_{n=-\infty}^{\infty} c_n e^{inx}$$

is the Fourier series for f:

$$c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx, \tag{7.4}$$

then

$$\|f\|_2^2 = \sum_{n=-\infty}^{\infty} |c_n|^2 \tag{7.5}$$

and (consequently) the partial sums

$$s_N(x) = \sum_{n=-N}^{N} c_n e^{inx}$$

converge to f in the “square-integral” norm, i.e.,

$$\|f - s_N\|_2 = \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x) - s_N(x)|^2\,dx \right)^{1/2}$$

tends to 0. Lebesgue’s integral provides the following beautiful converse:
If $\{c_n\}$ is any sequence of complex numbers such that

$$\sum_n |c_n|^2 < \infty,$$

then there is a complex-valued function f on $[-\pi, \pi]$ such that f and $|f|^2$
are Lebesgue integrable and $\{c_n\}$ is the sequence of Fourier coefficients of f.
This gives a complete correspondence between $L^2$, the space of Lebesgue
square-integrable functions on $[-\pi, \pi]$, and $\ell^2$, the space of square-sum-
mable sequences of complex numbers. The correspondence is 1 : 1 if we
agree to disregard null functions, those which are zero at almost every
point; and it preserves norms, (7.5).
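
The norm identity (7.5) can be checked numerically on one example. The sketch below assumes f(x) = x on the interval from -pi to pi, and it assumes the normalization in which both the coefficients and the squared norm carry the factor 1/(2 pi); that normalization is an assumption about the conventions of Chapter 5. For this f one finds c_0 = 0 and c_n = i(-1)^n / n otherwise, so the sum of |c_n|^2 is twice the sum of 1/n^2. The function names are hypothetical.

```python
import cmath
import math

def coefficient(n, cells=20000):
    """Midpoint-rule approximation of c_n = (1/2pi) * integral of x*exp(-inx) over [-pi, pi]."""
    h = 2.0 * math.pi / cells
    total = 0j
    for i in range(cells):
        x = -math.pi + (i + 0.5) * h
        total += x * cmath.exp(-1j * n * x)
    return total * h / (2.0 * math.pi)

def norm_squared():
    """(1/2pi) * integral of |x|**2 over [-pi, pi], in closed form: pi**2 / 3."""
    return math.pi ** 2 / 3.0

def coefficient_energy(N):
    """Sum of |c_n|**2 over 0 < |n| <= N, using |c_n|**2 = 1/n**2 and c_0 = 0."""
    return 2.0 * sum(1.0 / n ** 2 for n in range(1, N + 1))

if __name__ == "__main__":
    print(coefficient(3), 1j * (-1) ** 3 / 3)
    print(coefficient_energy(100000), norm_squared())
```

The partial sums of |c_n|^2 increase toward the squared norm and never exceed it, which is the content of (7.5) for this particular f.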

The reader may like to keep these few examples in mind as reasons
why we want to develop the Lebesgue integral. It is (iii) which we will use
as the guiding principle for how we develop it.


7.2. The Setting 


In this section we shall give a precise description of the setting in 
which we will develop the Lebesgue integral. We begin by reviewing the 
small amount of information we need about the (Riemann) integral of a 
continuous function. We shall change the context slightly from that of 
Chapter 4, in order to be able to work directly with integrals over (all of) 
$R^k$ instead of over a fixed box.


Definition. A complex-valued function f on $R^k$ is said to have compact
support if there exists a compact set K such that f(X) = 0 for each X not
in K.


Note that the sum of two functions of compact support is a function
of compact support: If $f_1$ vanishes outside $K_1$ and $f_2$ vanishes outside $K_2$,
then $f_1 + f_2$ vanishes outside $K_1 \cup K_2$. Therefore, the set of functions of
compact support is a linear subspace of the space of all (complex) func-
tions on $R^k$.

Notation. We denote by $C_c(R^k)$ the set of all continuous complex-
valued functions on $R^k$ which have compact support. Evidently this is
also a linear subspace of the space of functions on $R^k$.


We want to define the Riemann integral over $R^k$ of a continuous func-
tion of compact support. We need the following simple result.


Lemma. Let B be a closed box in $R^k$ and let $\tilde{B}$ be a closed box which
contains B. Let f be a continuous function on $\tilde{B}$ such that f = 0 on $\tilde{B} - B$.
Then

$$\int_{\tilde{B}} f = \int_B f.$$


Proof. The integral of f over B is approximated by sums

$$S(f, P, T) = \sum_i f(T_i)\, m(B_i)$$

where $P = \{B_1, \ldots, B_n\}$ is a partition of B and $T = \{T_1, \ldots, T_n\}$ is a
choice of points $T_i \in B_i$. Given any P, we can find a partition $\tilde{P}$ of $\tilde{B}$
such that

(a) $P \subset \tilde{P}$, i.e., each box $B_i$ is one of the boxes of $\tilde{P}$;
(b) $\|\tilde{P}\| \le \|P\|$.

Given a “choice” T, extend it to a choice $\tilde{T}$ of points in the boxes of $\tilde{P}$, sub-
ject to the condition that, if a box in $\tilde{P}$ is not one of the boxes of P, then
the point chosen in the box is outside B. Then, since f = 0 on $\tilde{B} - B$,

$$S(f, \tilde{P}, \tilde{T}) = S(f, P, T).$$

If the mesh $\|P\|$ (and therefore the mesh $\|\tilde{P}\|$) is small, then $S(f, P, T)$
will be near $\int_B f$ and $S(f, \tilde{P}, \tilde{T})$ will be near $\int_{\tilde{B}} f$. This establishes the
lemma.


Definition. If f is a continuous function on $R^k$ which has compact sup-
port, the Riemann integral of f over $R^k$ is

$$\int f = \int_B f$$

where B is any closed box such that f = 0 outside B.


The point is, of course, that $\int_B f$ does not depend on B, as long as
f = 0 outside B. In other words, if $B_1$ and $B_2$ are closed boxes outside of
which f vanishes, then

$$\int_{B_1} f = \int_{B_2} f.$$

This follows immediately from the last lemma, which tells us that

$$\int_{B_1} f = \int_{B} f = \int_{B_2} f$$

for any closed box B which contains both $B_1$ and $B_2$.


Theorem 1. The Riemann integral on $C_c(R^k)$ has these properties.

(i) It is linear,

$$\int (cf + g) = c \int f + \int g.$$

(ii) It is positive; i.e., if $f \ge 0$, then

$$\int f \ge 0.$$

(iii) It is weakly continuous; i.e., if $f_1 \ge f_2 \ge f_3 \ge \cdots$ is a monotone-
decreasing sequence of non-negative functions in $C_c(R^k)$ which tends
pointwise to 0, then

$$\lim_n \int f_n = 0.$$


Proof. It is only (iii) which is not obvious. To prove it, choose a
closed box B such that $f_1 = 0$ outside B. Since $0 \le f_n \le f_1$, we see that
$f_n = 0$ outside B. Accordingly,

$$\int f_n = \int_B f_n, \qquad n = 1, 2, 3, \ldots.$$

Since (a) each $f_n$ is continuous, (b) $\{f_n\}$ converges monotonely downward
to 0 on B, and (c) B is compact, Dini’s theorem (Theorem 1 of Chapter 5)
tells us that $\{f_n\}$ converges uniformly to 0 on B. Therefore,

$$\lim_n \int f_n = 0.$$
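
The role of Dini’s theorem in this proof can be seen in a small experiment. The sketch below assumes a particular decreasing sequence on the real line, chosen here for illustration and not taken from the text: f_n(x) = t(x)(1 - t(x))^n, where t(x) = max(0, 1 - |x|) is a tent function. Each f_n is continuous with support in [-1, 1], the sequence decreases pointwise to 0, and both the sup norm and the integral, which equals 2/((n + 1)(n + 2)), tend to 0. The function names are hypothetical.

```python
def tent(x):
    """Tent function t(x) = max(0, 1 - |x|), supported on [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def f(n, x):
    """Assumed example f_n(x) = t(x) * (1 - t(x))**n, decreasing in n at every x."""
    t = tent(x)
    return t * (1.0 - t) ** n

def sup_norm(n, cells=20000):
    """Approximate sup of f_n over [-1, 1] on a uniform grid."""
    return max(f(n, -1.0 + 2.0 * i / cells) for i in range(cells + 1))

def integral(n):
    """Exact integral of f_n over the line: 2 / ((n + 1) * (n + 2))."""
    return 2.0 / ((n + 1) * (n + 2))

if __name__ == "__main__":
    for n in (1, 5, 20, 100):
        print(n, sup_norm(n), integral(n))
```

Because the sup norms tend to 0, the convergence is uniform on the box [-1, 1], and the integrals are then bounded by the length of the box times the sup norm.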


Now we can explain the idea behind what we intend to do. We con-
sider the normed linear space which consists of $C_c(R^k)$ together with the
norm

$$\|f\|_1 = \int |f|. \tag{7.6}$$

This norm is called the $L^1$-norm on $C_c(R^k)$.
We are going to complete the normed linear space $(C_c, \|\cdots\|_1)$ to a
Banach space. Since the integral is uniformly continuous on $C_c(R^k)$:

$$\left| \int f - \int g \right| \le \int |f - g| = \|f - g\|_1,$$

it will extend uniquely to a (linear) function on the completion. Essentially,
this completion will be the space of Lebesgue-integrable functions and the
extension of $\int$ will be the integral on this space.


In Chapter 6 we showed how to complete any given normed linear 
space to a Banach space. We shall not need to use this general result, but 
its method of proof is what we will keep in the backs of our minds to guide 
what we do. The basic idea of the completion process is very simple: With 
each Cauchy sequence in the normed linear space we associate an abstract 
vector to serve as the limit of the sequence, subject to the condition that, 
when the difference of two Cauchy sequences converges to 0, they should 
have the same associated “abstract” limit vector. Thus, in the case at hand, 
we want to associate a limit to each sequence $\{f_n\}$ in $C_c(R^k)$ which is Cauchy
in the $L^1$-norm:

$$\lim_{m,n} \|f_n - f_m\|_1 = 0. \tag{7.7}$$

But we want to assign a limit which is more than an abstract vector. We
want the limit to be a function on $R^k$. Therefore, what we would hope is
that the $L^1$-Cauchy condition (7.7) would imply that the sequence $\{f_n\}$
converges pointwise. Unfortunately, this is not always so. What is true,
however, is that if $\{f_n\}$ is a fast $L^1$-Cauchy sequence:

$$\sum_n \|f_{n+1} - f_n\|_1 < \infty,$$

then $\{f_n\}$ converges pointwise at “almost every” point of $R^k$. Thus we
obtain a function to employ as the limit of the Cauchy sequence. The
functions which arise in this way will be the Lebesgue-integrable functions
on $R^k$. That is, a Lebesgue-integrable function on $R^k$ will be a (complex-
valued) function defined at “almost every” point of $R^k$ such that there
exists a fast $L^1$-Cauchy sequence in $C_c(R^k)$ which converges to it pointwise
at “almost every” point. The integral of such a function will (of course) 
be the limit of the integrals of the functions in the Cauchy sequence. It 
will require a bit of effort to formulate all of this precisely and to verify 
that it is valid. 
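
To see why an L1-Cauchy sequence need not converge pointwise, and why passing to a fast subsequence helps, consider the following sketch. It assumes a “sliding tent” construction on the real line, chosen here for illustration and not taken from the text: at level k there are 2^k tents of height 1 and half-width 2^(-k), centred at (j + 1/2)/2^k, and the sequence runs through level 0, then level 1, and so on. The L1-norms tend to 0, so the sequence is L1-Cauchy, but at each fixed x in [0, 1] every level contains a tent of value at least 1/2 and (from level 2 on) a tent of value 0. The function names are hypothetical.

```python
def tent(k, j, x):
    """Tent of height 1 centred at (j + 1/2) / 2**k with half-width 2**-k."""
    w = 2.0 ** (-k)
    c = (j + 0.5) * w
    return max(0.0, 1.0 - abs(x - c) / w)

def l1_norm(k):
    """L1-norm of any tent at level k: a triangle of base 2 * 2**-k and height 1."""
    return 2.0 ** (-k)

def level_values(k, x):
    """Values at x of all 2**k tents of level k, in the order they occur in the sequence."""
    return [tent(k, j, x) for j in range(2 ** k)]

if __name__ == "__main__":
    x = 0.3
    for k in range(2, 8):
        vals = level_values(k, x)
        # The values at x keep oscillating between 0 and at least 1/2.
        print(k, l1_norm(k), min(vals), max(vals))
    # The subsequence made of the first tent of each level has summable
    # L1 differences, and its values at x are eventually 0.
    print([tent(k, 0, x) for k in range(0, 8)])
```

The full sequence oscillates at every point of [0, 1], while the subsequence consisting of the first tent of each level is fast Cauchy and converges to 0 at every x other than 0.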


Exercises 


1. Let I be a closed (bounded) interval on the real line and let J be an open
interval which contains I. Construct a continuous function f on $R^1$ such that
f = 1 on I, f = 0 outside J, and $0 \le f \le 1$ on J - I.


2. Let K be a compact subset of $R^k$ and let U be a bounded open set which con-
tains K. Let $g(X) = d(X, R^k - U)$. Let I be the closed interval [a, b], where

$$a = \inf_K g, \qquad b = \sup_K g.$$

By proper choice of a function f as in Exercise 1 show that the composition
$h = f \circ g$ satisfies
(i) $h \in C_c(R^k)$;
(ii) h = 1 on K;
(iii) h = 0 outside U;
(iv) $0 \le h \le 1$ on U - K.


3. Let f be a continuous function on $R^k$, e.g., a polynomial, and let K be any
compact subset of $R^k$. Use the result of Exercise 2 to show that there is a con-
tinuous function on $R^k$ which has compact support and agrees with f on K.


4. Let f be a function in $C_c(R^k)$. Prove that the real and imaginary parts of f
are in $C_c(R^k)$, as is |f|.

5. Prove that the $L^1$-norm is a norm on $C_c(R^k)$.

6. Let K and U be as in Exercise 2 and let h be the function constructed there.
Then $h \ge h^2 \ge h^3 \ge \cdots$ is a monotone-decreasing sequence of functions in
$C_c(R^k)$. Does it converge uniformly? What would you call

$$\lim_n \int h^n?$$

7. If f is a complex-valued function on $R^k$, the support of f is the closure of the
set $\{X;\ f(X) \ne 0\}$.


(a) Show that a function has compact support if and only if its support is
compact.

(b) Show that, if f has compact support, the support of f is the smallest
compact set K such that f = 0 outside K.


8. Every function in $C_c(R^k)$ is bounded, hence the sup norm is a norm on
$C_c(R^k)$.

9. Is the normed linear space $(C_c, \|\cdots\|_\infty)$ complete? If not, describe its com-
pletion (as a subspace of the bounded continuous functions on $R^k$).


*10. If K is a compact subset of $R^k$ and f is a continuous complex-valued func-
tion on K, there exists $g \in C_c(R^k)$ such that g(X) = f(X), $X \in K$.


We have made several references to convergence at “almost every”
point, and we have indicated that this phenomenon will of necessity arise
in the discussion of Lebesgue-integrable functions. In this section, we will
give a precise formulation of it, using “sets of measure zero”. We will gain
a certain technical advantage if we discuss the more general concept of
“outer measure” and then relate both it and “measure zero” to Riemann
integrals.


Definition. If S is a subset of $R^k$, the outer measure of S is

$$m^*(S) = \inf_{\{B_n\}} \sum_{n=1}^{\infty} m(B_n) \tag{7.8}$$

where the infimum is taken over all countable coverings of S by boxes.

Some explanation is in order. Given the set S, there are various
sequences of boxes $B_1, B_2, B_3, \ldots$ which cover S:

$$S \subset \bigcup_{n=1}^{\infty} B_n.$$

The collection of all boxes which have rational vertices (corners) is one
such covering. This tells us that $0 \le m^*(S) \le \infty$. This is not, however,
the sort of countable covering which motivates the definition of outer
measure. We have in mind, rather, a sequence of very tiny boxes such that

(i) $B_1, B_2, B_3, \ldots$ just barely cover S;
(ii) there is very little overlap of the various boxes $B_n$.

For such a sequence $\{B_n\}$ the sum $\sum_n m(B_n)$ ought to be just a little bit
larger than the number which we would hope to call the measure (k-
dimensional volume) of S. As we use smaller and smaller boxes, we can
improve on (i) and (ii), getting closer and closer to the “measure” of S.
The smallest number we can hope to attain as a limit is the infimum, which
we have named $m^*(S)$. We call this the “outer measure” of S rather than
just the “measure” of S, for two reasons. First, we obtained it by squeezing
down on S from the outside. Second, we indicated in Section 7.1 that it is
not possible to assign a measure to all subsets of $R^k$ so that the properties
we expect measure to have are preserved; i.e., there exist “non-measurable”
sets.

To most people it would appear more natural to use finite coverings
by boxes in defining outer measure. There is even a systematic way to go
about this which has a “natural” appeal. We subdivide all of $R^k$ into boxes
using a gridwork of some given “mesh”. Then to estimate the outer
measure of a given set S, we add up the areas of those boxes which con-
tain points of S. (See Figure 27.) By using grids of finer and finer mesh,
one ought to obtain sums which converge downward to the measure (or
outer measure) of S. Obviously, for unbounded sets this would involve
infinite coverings, but for bounded sets only a finite number of boxes
formed by any given gridwork would touch S. For very “nice” sets this
process is adequate, but, as Lebesgue realized, it is inadequate for many
(bounded) sets which one must deal with in order to have an adequate
theory of integration. (See Example 1.)


FIGURE 27


Lemma. If $S, S_1, S_2, \ldots$ are subsets of $R^k$ such that

$$S \subset \bigcup_{n=1}^{\infty} S_n$$

then

$$m^*(S) \le \sum_{n=1}^{\infty} m^*(S_n). \tag{7.9}$$


Proof. If $m^*(S_n) = \infty$ for some n, the inequality is trivially true. If
$m^*(S_n) < \infty$ for every n, we proceed as follows. Let ε > 0. By the defini-
tion of $m^*(S_1)$, there is a sequence $\{B_n^1\}$ of boxes which covers $S_1$ and
satisfies

$$\sum_n m(B_n^1) < m^*(S_1) + \frac{\varepsilon}{2}.$$

Similarly, there is a sequence of boxes $\{B_n^2\}$ which covers $S_2$ such that

$$\sum_n m(B_n^2) < m^*(S_2) + \frac{\varepsilon}{4}.$$

In general (for each k = 1, 2, 3, ...) we can find a sequence of boxes
$\{B_n^k\}$ which covers $S_k$ such that

$$\sum_n m(B_n^k) < m^*(S_k) + \frac{\varepsilon}{2^k}.$$

Since the sequence of sets $B_1^k, B_2^k, B_3^k, \ldots$ covers $S_k$ and the sequence
$\{S_k\}$ covers S, the double sequence $\{B_n^k;\ k = 1, 2, 3, \ldots,\ n = 1, 2, 3, \ldots\}$
is a countable covering of S by boxes. Furthermore,

$$\sum_{k,n} m(B_n^k) = \sum_k \sum_n m(B_n^k) \le \sum_k \left[ m^*(S_k) + \frac{\varepsilon}{2^k} \right] = \sum_k m^*(S_k) + \varepsilon.$$

Therefore,

$$m^*(S) \le \sum_n m^*(S_n) + \varepsilon. \tag{7.10}$$

Since (7.10) holds for every ε > 0, the lemma is established.


Note that the lemma includes as a special case the (trivial) fact that
$m^*(S) \le m^*(T)$ whenever $S \subset T$, as one sees by taking $S_1 = T$ and
$S_n = \emptyset$, $n \ge 2$.
It also includes the assertion that, for any sets S and T, we have

$$m^*(S \cup T) \le m^*(S) + m^*(T). \tag{7.11}$$

It is not always true that (7.11) is an equality when $S \cap T = \emptyset$; that is,
outer measure is not an additive function on the class of all subsets of $R^k$.
We cannot prove this assertion now, because it would involve giving an
example of a “non-measurable” set.

Let us hasten to point out that closed sets, open sets, and any sets
we can generate from these will turn out to be “measurable” and $m^*$ will
provide their appropriate measures. This last assertion is not trivial to
verify, even for the simplest of sets. For instance, suppose B is a box in
$R^k$. Obviously it had better be the case that


m*(B) = m(B). 


But how do we know this is true? You will be guided through a proof of 
this in the exercises. 

We need to use the following simple relationship between Riemann 
integrals and outer measure. 


Lemma. Let f be a non-negative continuous function on $R^k$ which has
compact support. For each number c > 0

$$\int f \ge c\, m^*(\{X;\ f(X) \ge c\}). \tag{7.12}$$


Proof. Let c > 0 and $S_c = \{X;\ f(X) \ge c\}$. Let B be any closed box
such that f = 0 outside B and let $P = \{B_1, \ldots, B_N\}$ be a partition of B.
We will show that there is a choice T of points $T_i \in B_i$ such that
$S(f, P, T) \ge c\, m^*(S_c)$. We choose the points $T_1, \ldots, T_N$ as follows.
If $B_i \cap S_c \ne \emptyset$, choose some point $T_i \in B_i \cap S_c$. If $B_i \cap S_c = \emptyset$,
let $T_i$ be any point in $B_i$. Since $f \ge 0$,

$$S(f, P, T) = \sum_i f(T_i)\, m(B_i) \ge c \sum_{S_c \cap B_i \ne \emptyset} m(B_i). \tag{7.13}$$

Since c > 0 and f = 0 outside B, the set $S_c$ is contained in B. Thus the set
of boxes $B_i$ for which $S_c \cap B_i \ne \emptyset$ is a (finite) covering of $S_c$, and so

$$m^*(S_c) \le \sum_{S_c \cap B_i \ne \emptyset} m(B_i).$$

If we combine this with (7.13) we have $S(f, P, T) \ge c\, m^*(S_c)$. Since this holds
for every partition P, and the sums S(f, P, T) tend to $\int f$ as the mesh of P
tends to 0, the inequality (7.12) follows.
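
A one-dimensional check of (7.12) may help fix ideas. The sketch below assumes the tent function f(x) = max(0, 1 - |x|) on the real line, an example chosen here, and it takes for granted that the outer measure of a closed interval is its length (the text defers the proof of this to the exercises). For 0 < c <= 1 the set where f >= c is the interval from -(1 - c) to 1 - c. The function names are hypothetical.

```python
def integral_of_tent():
    """Integral of f(x) = max(0, 1 - |x|): a triangle of base 2 and height 1."""
    return 1.0

def length_of_superlevel_set(c):
    """Length of {x; f(x) >= c} = [-(1 - c), 1 - c] for 0 < c <= 1; empty for c > 1."""
    return 2.0 * (1.0 - c) if c <= 1.0 else 0.0

def chebyshev_gap(c):
    """Left side minus right side of (7.12); the inequality says this is >= 0."""
    return integral_of_tent() - c * length_of_superlevel_set(c)

if __name__ == "__main__":
    for c in (0.1, 0.25, 0.5, 0.75, 1.0, 1.5):
        print(c, length_of_superlevel_set(c), chebyshev_gap(c))
```

For this f the product of c and the length of the superlevel set is largest at c = 1/2, where it equals 1/2, still well below the integral 1.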


Definition. A subset S of $R^k$ is called a set of measure zero if
$m^*(S) = 0$.


Note that any subset of a set of measure zero is a set of measure zero.
The following extends this observation a bit.

Lemma. The following conditions on a set S are equivalent.

(i) S is a set of measure zero.
(ii) S is the union of a sequence of sets of measure zero.
(iii) For each ε > 0 there is a sequence of sets $S_1, S_2, \ldots$ such that

$$S \subset \bigcup_{n=1}^{\infty} S_n, \qquad \sum_{n=1}^{\infty} m^*(S_n) < \varepsilon. \tag{7.14}$$

Proof. The implications (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) are trivial. If (iii) holds,
then we know by the inequality (7.9):

$$m^*(S) \le \sum_{n=1}^{\infty} m^*(S_n)$$

that $m^*(S) < \varepsilon$, for every ε > 0. Hence S is a set of measure zero.


EXAMPLE 1. The set consisting of a single point is a set of measure
zero; indeed, any countable subset of $R^k$ is a set of measure zero. Let’s
see why. Let S be a countable set and let $X_1, X_2, X_3, \ldots$ be an enumeration
of the points of S. Let ε > 0. For each n, choose an open box $B_n$ such that

$$X_n \in B_n, \qquad m(B_n) < \frac{\varepsilon}{2^n}.$$

Then the sequence $\{B_n\}$ covers S and

$$\sum_n m(B_n) < \varepsilon.$$

Thus $m^*(S) < \varepsilon$, for every ε > 0.

As a special case of this example, we see that the set of points in $R^k$
which have rational coordinates is a set of measure zero. In particular,
the rational numbers comprise a set of measure zero on the real line. Every
student of analysis should be acutely aware of the demonstration of this, a
demonstration which can be rephrased as follows for the positive rational
numbers. Enumerate these numbers (points) in some way, e.g., according
to the scheme


Suppose we have an interval of length ε > 0. Divide the interval in half
and use one half to cover the first rational point. Divide the remainder in
half and use one half (one quarter of the original) to cover the second
rational point. Divide in half again and cover the third rational point, etc.
This describes a scheme, infinite though it may be, for covering the posi-
tive rational points by a sequence of intervals, the sum of whose lengths
is ε. So we conclude that the “length” of the set of these rational points is 0.
Note that if we cover the rational points in [0, 1] by any finite number of
intervals, the sum of the lengths of those intervals is at least 1, because the
rational points are dense in [0, 1]. This shows why, even for bounded sets,
we use infinite coverings to define outer measure.
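
The halving scheme just described can be carried out exactly for the first few rational points. The sketch below assumes one particular enumeration of the positive rationals (by increasing p + q, fractions in lowest terms only), which is an illustrative choice and not necessarily the scheme intended in the text. It uses exact rational arithmetic, and the function names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def positive_rationals(count):
    """First `count` positive rationals, listed by increasing p + q (lowest terms only)."""
    found = []
    total = 2
    while len(found) < count:
        for p in range(1, total):
            q = total - p
            if gcd(p, q) == 1:
                found.append(Fraction(p, q))
                if len(found) == count:
                    break
        total += 1
    return found

def covering(count, eps):
    """Intervals (left, right) of length eps / 2**n centred at the n-th rational, n = 1, 2, ..."""
    intervals = []
    for n, r in enumerate(positive_rationals(count), start=1):
        half = eps / 2 ** (n + 1)
        intervals.append((r - half, r + half))
    return intervals

def total_length(intervals):
    """Exact sum of the lengths of the given intervals."""
    return sum((right - left for left, right in intervals), Fraction(0))

if __name__ == "__main__":
    eps = Fraction(1, 100)
    cover = covering(50, eps)
    print(positive_rationals(5))
    print(float(total_length(cover)), float(eps))
```

However many rational points are covered in this way, the total length used stays strictly below the prescribed ε.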

The fact that we must use countable coverings rather than just finite
ones to see that the set of rational points is a set of measure zero is related
to the fact that we must use Lebesgue’s rather than Riemann’s process to
integrate the function

$$f(x) = \begin{cases} 1, & x \text{ irrational} \\ 0, & x \text{ rational.} \end{cases}$$

You will recall (Example 4 of Chapter 4) that this function f is not Rie-
mann-integrable on [0, 1]. It is (or, will be) Lebesgue-integrable and we
will have

$$\int_0^1 f(x)\,dx = 1$$

because f(x) = 1 except on a set of measure zero, which is a negligible
set from the Lebesgue point of view.


EXAMPLE 2. The Cantor set (Example 15 of Chapter 2) is an uncount-
able set of measure zero in $R^1$. This set K is obtained by deleting from the
closed unit interval a sequence of open intervals $I_1, I_2, I_3, \ldots$ such that
$\sum_n m(I_n) = 1$. It seems plausible that $m^*(K) = 0$ because in forming K
we removed from [0, 1] all of the length. In fact, this is easily made precise.
After we have removed the open intervals $I_1, \ldots, I_n$, the part of [0, 1]
which remains is the union of a finite number of non-overlapping intervals.
(See, for example, Figure 5 on p. 59.) These remaining intervals cover
K and the sum of their lengths is

$$1 - \sum_{j=1}^{n} m(I_j),$$

a sum which tends to 0 as n increases.
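
The finite coverings of the Cantor set can be generated explicitly. The sketch below assumes the usual middle-thirds construction (deleting the open middle third of every remaining interval at each round), which is presumably the construction of Example 15 of Chapter 2 but is stated here as an assumption. It works with exact fractions, and the function names are hypothetical.

```python
from fractions import Fraction

def next_stage(intervals):
    """Delete the open middle third of each closed interval (a, b)."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result

def stage(n):
    """Closed intervals that remain after n rounds of deletion, starting from [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = next_stage(intervals)
    return intervals

def remaining_length(n):
    """Total length of the stage-n intervals, an upper bound for the outer measure of K."""
    return sum((b - a for a, b in stage(n)), Fraction(0))

if __name__ == "__main__":
    for n in range(0, 11):
        print(n, len(stage(n)), float(remaining_length(n)))
```

After n rounds there are 2^n intervals of total length (2/3)^n, so the finite coverings have total length tending to 0.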
We come now to the first non-trivial result. 


Theorem 2. A subset S of $R^k$ is a set of measure zero if and only if
there exists a sequence $\{f_n\}$ such that

(i) each $f_n$ is a real-valued continuous function of compact support
on $R^k$;
(ii) $f_1 \le f_2 \le f_3 \le \cdots$;
(iii) the sequence of integrals $\int f_n$ is bounded;
(iv) $\lim_n f_n(X) = \infty$ for every point $X \in S$.


Proof. Suppose we are given a sequence $\{f_n\}$ with properties (i)-(iv).
By (ii) and (iii), the sequence of integrals $\int f_n$ is monotone-increasing and
bounded above, hence it converges. We are now going to replace $\{f_n\}$ by a
subsequence of itself. Let $g_j = f_{n_j}$, where $n_1 < n_2 < n_3 < \cdots$ and

$$\int f_{n_j} > \lim_n \int f_n - 4^{-j}.$$

Then the sequence $\{g_n\}$ satisfies (i)-(iv) and

$$\int (g_{n+1} - g_n) < 4^{-n}, \qquad n = 1, 2, 3, \ldots.$$

Let ε > 0. We will show that $m^*(S) < \varepsilon$. Let

$$E_n = \left\{ X;\ g_{n+1}(X) - g_n(X) \ge \frac{1}{\varepsilon 2^n} \right\}.$$

Then (iv) tells us that

$$S \subset \bigcup_n E_n.$$

Why? Because, if $X \notin E_n$ for every n, we have

$$g_{n+1}(X) - g_n(X) < \frac{1}{\varepsilon 2^n}, \qquad n = 1, 2, 3, \ldots$$

and thus

$$\begin{aligned}
g_n(X) &= g_1(X) + [g_2(X) - g_1(X)] + \cdots + [g_n(X) - g_{n-1}(X)] \\
&\le g_1(X) + \sum_{j=1}^{\infty} [g_{j+1}(X) - g_j(X)] \\
&< g_1(X) + \sum_{j=1}^{\infty} \frac{1}{\varepsilon 2^j} \\
&= g_1(X) + \frac{1}{\varepsilon}
\end{aligned}$$

for every n; hence $\lim_n g_n(X) < \infty$.
Since $\int (g_{n+1} - g_n) < 4^{-n}$, the inequality (7.12) tells us that

$$\frac{1}{\varepsilon 2^n}\, m^*(E_n) < 4^{-n},$$

that is, $m^*(E_n) < \varepsilon 2^{-n}$.
Therefore,

$$\sum_{n=1}^{\infty} m^*(E_n) < \varepsilon. \tag{7.15}$$

It follows from (7.9) and (7.15) that $m^*(S) < \varepsilon$.

Now suppose that $m^*(S) = 0$. We shall construct functions $f_n$ satisfy-
ing (i)-(iv). For each j there is a sequence of boxes $\{B_n^j\}$ such that

$$S \subset \bigcup_{n=1}^{\infty} B_n^j, \qquad \sum_{n=1}^{\infty} m(B_n^j) < 2^{-j}.$$

For each pair (j, n) choose a non-negative function $f_{jn} \in C_c(R^k)$ such that
$f_{jn} \ge 1$ on $B_n^j$ and $\int f_{jn} \le 2m(B_n^j)$. (The verification that there is such a
function $f_{jn}$ is quite easy and has been left to the exercises.) Then the
sequence of functions we seek is

$$f_p = \sum_{j+n \le p} f_{jn}.$$

Certainly $f_p \in C_c(R^k)$, and $f_1 \le f_2 \le f_3 \le \cdots$ because each $f_{jn}$ is non-
negative. As for condition (iii), we have

$$\int f_p = \sum_{j+n \le p} \int f_{jn} \le 2 \sum_j \sum_n m(B_n^j) < 2 \sum_j 2^{-j} = 2.$$

Moreover, since $f_{jn} \ge 1$ on $B_n^j$ the value of $f_p(X)$ is at least as great as the
number of pairs (j, n) such that $j + n \le p$ and $X \in B_n^j$. If $X \in S$, then,
for every j, there is an n such that $X \in B_n^j$. The point is that, if $X \in S$,
then $X \in B_n^j$ for infinitely many pairs (j, n); hence,

$$\lim_p f_p(X) = \infty.$$


Definition. Any phenomenon (associated with points in $R^k$) which hap-
pens except on a set of measure zero is said to happen almost everywhere
(frequently abbreviated a.e.) or at almost every point.


The last theorem states that if $f_1 \le f_2 \le f_3 \le \cdots$ is a monotone-
increasing sequence of (real-valued) functions in $C_c(R^k)$ such that the
integrals $\int f_n$ are bounded, then the limit $\lim_n f_n(X)$ is finite at almost every
point,

$$\lim_n f_n(X) < \infty, \quad \text{a.e.}$$

As another illustration of the use of the term “almost everywhere”,
consider the function on $R^1$ defined by

$$f(x) = \begin{cases} 0, & \text{if } x \text{ is irrational or } x = 0 \\ \dfrac{1}{q}, & \text{if } x = \dfrac{p}{q}, \text{ a fraction in “lowest terms”.} \end{cases}$$

You will recall that f is continuous at 0 and at each irrational point. Thus
f is continuous almost everywhere. In fact, f is discontinuous at each non-
zero rational point, but that has no bearing on the validity of the last
sentence. Each continuous function on $R^1$ is also continuous almost
everywhere. For a third illustration, consider the Cantor function (Exam-
ple 24 of Chapter 3). This is a continuous function on [0, 1] which is
differentiable at each point of [0, 1] not in the Cantor set K. Since K is a
set of measure zero, the Cantor function is differentiable almost every-
where on [0, 1]. Incidentally, we will prove later that a complex-valued
function on the interval [a, b] is Riemann-integrable if and only if it is (a)
bounded, and (b) continuous almost everywhere.


Exercises 


1. Let $B$ be a closed box in $R^k$. Show that there exists a non-negative $f$ in $C_c(R^k)$
such that $f \ge 1$ on $B$ and $\int f < 2m(B)$. (Hint: Refer to Exercise 1 of Section 7.2.)

2. If $S$ is a subset of $R^k$, then
$$m^*(S) = \inf \sum_k m(B_k)$$
where the infimum is taken over all countable coverings $\{B_k\}$ of $S$ by open boxes.
(Hint: The inf obtained by using only open boxes could only be larger than
$m^*(S)$. If $\{B_k\}$ is any sequence of boxes, we can fatten up $B_k$ to an open box
without increasing its measure by more than $\epsilon 2^{-k}$.)

3. If $K$ is a compact subset of $R^k$, then
$$m^*(K) = \inf \sum_{n=1}^{N} m(B_n)$$
where the infimum is taken over all finite (open) covers of $K$ by boxes.

4. If $B$ is a closed box, then $m^*(B) = m(B)$.

5. If $B$ is a box, the boundary of $B$ is a set of outer measure zero.

6. If $B$ is a box, then $m^*(B) = m(B)$.

7. If $B_1, \ldots, B_n$ are boxes contained in the open set $U$, and if the interiors
of $B_1, \ldots, B_n$ are pairwise disjoint, then
$$\sum_{j=1}^{n} m(B_j) \le m^*(U).$$
(Hint: It suffices, by shrinking, to prove it when $B_1, \ldots, B_n$ are pairwise disjoint
closed boxes.)

8. Let $U$ be an open subset of the real line $R^1$. Then (you are supposed to know
that) $U$ is the union of a countable collection of pairwise disjoint open intervals:
$$U = \bigcup_k I_k, \qquad I_j \cap I_k = \emptyset, \quad j \ne k.$$
Show that
$$m^*(U) = \sum_k m(I_k).$$
(Assume that $U$ is bounded, if you wish.)

9. Show that the circle $S^1 = \{(x, y);\ x^2 + y^2 = 1\}$ is a set of (2-dimensional)
measure zero.


10. Find a compact set $K$ in $R^1$ such that


(a) the interior of K is empty; 
(b) m*(K) > 0. 


(Hint: Construct K as you would the Cantor set, but take out smaller open 
intervals.) 


*11. Let $\psi$ be a complex-valued function (operator) on $C_c(R^k)$ with these
properties.

(i) $\psi$ is linear, $\psi(cf + g) = c\psi(f) + \psi(g)$.
(ii) If $B$ is a closed box and $f = 0$ outside $B$, then
$$|\psi(f)| \le m(B)\,\|f\|_\infty.$$
(iii) If $f \in C_c(R^k)$ is non-negative, then
$$\psi(f) \ge m(B) \inf_B f$$
for every closed box $B$.

Prove that $\psi(f) = \int f$ for every $f \in C_c(R^k)$.


7.4. The Principal Propositions 


We now proceed with the tasks (i) to assign a limit (function) to each
sequence in $C_c(R^k)$ which is (fast) Cauchy relative to the $L^1$-norm, (ii)
to assign an integral to each such limit function, (iii) to verify that the
resulting integral is linear, positive, etc. on the space of “integrable”
functions.


Theorem 3. If $\{f_n\}$ is a sequence of continuous functions of compact
support which is a fast Cauchy sequence in the $L^1$-norm,

(7.16) $$\sum_{n=1}^{\infty} \|f_{n+1} - f_n\|_1 < \infty$$

then $\{f_n\}$ converges pointwise almost everywhere on $R^k$.


Proof. Let
$$g_n = \sum_{j=1}^{n} |f_{j+1} - f_j|.$$
Then the functions $g_n$ satisfy conditions (i)-(iv) of Theorem 2. Therefore
$$\lim_n g_n(X) < \infty, \quad \text{a.e.}$$
that is,

(7.17) $$\sum_{n=1}^{\infty} |f_{n+1}(X) - f_n(X)| < \infty, \quad \text{a.e.}$$

At every point $X$ where (7.17) holds, $\{f_n(X)\}$ is a fast Cauchy sequence of
complex numbers and, therefore, converges.


Definition. A Lebesgue-integrable function on $R^k$ is a function $f$ such
that

(i) $f$ is a complex-valued function defined almost everywhere on $R^k$;
(ii) there is a fast $L^1$-Cauchy sequence $\{f_n\}$ in $C_c(R^k)$ such that
$$f(X) = \lim_n f_n(X), \quad \text{a.e.}$$

We have just seen that each fast $L^1$-Cauchy sequence $\{f_n\}$ in $C_c(R^k)$
converges pointwise almost everywhere to a Lebesgue-integrable function
$f$. We would like to define
$$\int f = \lim_n \int f_n.$$
Certainly the sequence $\{\int f_n\}$ converges, because it is a fast Cauchy
sequence of numbers:
$$\sum_n \left| \int f_{n+1} - \int f_n \right| \le \sum_n \int |f_{n+1} - f_n| < \infty.$$
What we must show now is that the limit
$$\lim_n \int f_n$$
depends only on the limit function $f$ and not on the particular sequence
$\{f_n\}$ which we use to approximate $f$. In other words, we must show that if
$\{f_n\}$ and $\{g_n\}$ are fast $L^1$-Cauchy sequences in $C_c(R^k)$ such that
$$\lim_n f_n(X) = \lim_n g_n(X), \quad \text{a.e.}$$
then
$$\lim_n \int f_n = \lim_n \int g_n.$$
Here is the key result.


Theorem 4. Let $\{f_n\}$ and $\{g_n\}$ be two monotone-increasing sequences of
non-negative continuous functions of compact support. If
$$\lim_n f_n(X) \ge \lim_n g_n(X), \quad \text{a.e.}$$
then
$$\lim_n \int f_n \ge \lim_n \int g_n.$$

Proof. Let
$$f = \lim_n f_n, \qquad g = \lim_n g_n.$$
Since $f_1 \le f_2 \le f_3 \le \cdots$ and $g_1 \le g_2 \le g_3 \le \cdots$, these limits exist at
every point, provided we allow $+\infty$ as a value. Since $f(X) \ge g(X)$
almost everywhere, Theorem 2 tells us that there is a sequence $\{h_n\}$ of
non-negative functions in $C_c(R^k)$ such that

(i) $h_1 \le h_2 \le h_3 \le \cdots$;
(ii) $\int h_n \le 1$, for every $n$;
(iii) $\lim_n h_n(X) = \infty$, if $f(X) < g(X)$.

Let $\epsilon > 0$. Condition (iii) tells us that

(7.18) $$\lim_n\, [f_n(X) + \epsilon h_n(X)] \ge g(X), \quad \text{all } X \in R^k.$$

Now fix a positive integer $p$. Since $g \ge g_p$, we have

(7.19) $$\lim_n\, [f_n(X) + \epsilon h_n(X)] \ge g_p(X), \quad \text{all } X \in R^k.$$

Let $B_p$ be a closed box such that $g_p = 0$ outside $B_p$. The sequence of
functions
$$\varphi_n = \max(0,\ g_p - f_n - \epsilon h_n)$$
converges monotonely downward to 0. Since each $\varphi_n$ is continuous and
$B_p$ is compact, this convergence to 0 must be uniform on $B_p$ (Dini’s theorem,
Theorem 1 of Chapter 5). Thus for some $N$ we have
$$\varphi_N(X) < \frac{\epsilon}{m(B_p)}, \quad X \in B_p$$
that is
$$f_N(X) + \epsilon h_N(X) \ge g_p(X) - \frac{\epsilon}{m(B_p)}, \quad X \in B_p.$$
From this it follows that
$$\int f_N + \epsilon \ \ge\ \int f_N + \epsilon \int h_N \ \ge\ \int_{B_p} (f_N + \epsilon h_N) \ \ge\ \int_{B_p} g_p - \epsilon \ =\ \int g_p - \epsilon.$$
Accordingly,

(7.20) $$\lim_n \int f_n \ge \int g_p - 2\epsilon.$$

Since (7.20) holds for every $p$ and $\epsilon$, we have
$$\lim_n \int f_n \ge \lim_n \int g_n.$$


Theorem 5. Let $\{f_n\}$ and $\{g_n\}$ be fast $L^1$-Cauchy sequences of
real-valued continuous functions of compact support. If
$$\lim_n f_n(X) \ge \lim_n g_n(X), \quad \text{a.e.,}$$
then
$$\lim_n \int f_n \ge \lim_n \int g_n.$$

Proof. Define $f_0 = g_0 = 0$. Then let
$$\alpha_n = \max(0,\ f_n - f_{n-1} - g_n + g_{n-1})$$
$$\beta_n = \max(0,\ g_n - g_{n-1} - f_n + f_{n-1})$$
so that
$$\alpha_n - \beta_n = f_n - f_{n-1} - g_n + g_{n-1}$$
$$\sum_{k=1}^{n} (\alpha_k - \beta_k) = f_n - g_n.$$
The sequences
$$F_n = \sum_{k=1}^{n} \alpha_k, \qquad G_n = \sum_{k=1}^{n} \beta_k$$
then satisfy the hypotheses of Theorem 4. Thus
$$0 \le \lim_n \int (F_n - G_n) = \lim_n \int \sum_{k=1}^{n} (\alpha_k - \beta_k) = \lim_n \int (f_n - g_n).$$


Corollary. If $\{f_n\}$ and $\{g_n\}$ are fast $L^1$-Cauchy sequences in $C_c(R^k)$ such
that
$$\lim_n f_n(X) = \lim_n g_n(X), \quad \text{a.e.,}$$
then
$$\lim_n \int f_n = \lim_n \int g_n.$$

Proof. Apply Theorem 5 separately to the sequences of real parts of
$f_n$ and $g_n$ and to the sequences of imaginary parts.


Definition. If $f$ is a Lebesgue-integrable function, the (Lebesgue)
integral of $f$ is
$$\int f = \lim_n \int f_n$$
where $\{f_n\}$ is any fast $L^1$-Cauchy sequence in $C_c(R^k)$ which converges to $f$
pointwise almost everywhere.


It is (of course) the last corollary which ensures that the integral is 
well-defined. Let us summarize the elementary properties of integration 
and integrable functions. 


Theorem 6. Let $f$ be a complex-valued function defined almost
everywhere on $R^k$.

(i) If $f$ is Lebesgue-integrable, then $(cf + g)$ is Lebesgue-integrable
for every complex number $c$ and every Lebesgue-integrable function $g$, and
$$\int (cf + g) = c \int f + \int g.$$

(ii) If $u = \mathrm{Re}(f)$ and $v = \mathrm{Im}(f)$, that is, if $f = u + iv$ where $u$ and
$v$ are real-valued, then $f$ is Lebesgue-integrable if and only if both $u$ and $v$
are Lebesgue-integrable.

(iii) If $f$ is Lebesgue-integrable, then $|f|$ is Lebesgue-integrable and
$$\left| \int f \right| \le \int |f|.$$

(iv) If $f$ is Lebesgue-integrable and $f(X) \ge 0$ almost everywhere, then
$$\int f \ge 0.$$


Proof. We have left the proofs of the various statements to the 
exercises. 


Let us close this section by asking ourselves what concrete examples 
we know of Lebesgue-integrable functions, other than continuous func- 
tions of compact support. 


EXAMPLE 3. The most important device we know for “constructing”
Lebesgue-integrable functions from functions in $C_c(R^k)$ is provided by
Theorem 2: If $f_1 \le f_2 \le f_3 \le \cdots$ is a monotone-increasing sequence of
continuous functions of compact support and the sequence of integrals $\{\int f_n\}$
is bounded, then
$$f(X) = \lim_n f_n(X) < \infty, \quad \text{a.e.,}$$
$f$ is a Lebesgue-integrable function, and
$$\int f = \lim_n \int f_n.$$
Of course, the same is true for decreasing sequences. It is a fact (which we
shall not stop to prove) that every real-valued Lebesgue-integrable function
$f$ is a sum $f = f_1 + f_2 + f_3$, where $f_1$ is the limit of a monotone-increasing
sequence in $C_c(R^k)$, $f_2$ is the limit of a decreasing sequence in $C_c(R^k)$, and
$f_3 = 0$ almost everywhere. At this point, let us note some things we can
do easily with the monotone convergence device.
Let $K$ be any compact subset of $R^k$. There exists a function $h \in C_c(R^k)$
such that

(7.21) $$h = 1 \ \text{on } K, \qquad 0 \le h < 1 \ \text{off } K.$$

(See Exercise 2, Section 7.2.) The sequence of powers $h \ge h^2 \ge h^3 \ge \cdots$
is decreasing and converges pointwise (everywhere) to the characteristic
function of $K$:
$$\lim_n h(X)^n = k_K(X) = \begin{cases} 1, & X \in K \\ 0, & X \notin K. \end{cases}$$
By the monotone convergence property, we see that the characteristic
function of any compact set is Lebesgue-integrable. (The boundedness of
the sequence of integrals is trivially satisfied in this case because $0 \le h^n \le h$.)
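As a numerical illustration of the powers $h^n$ decreasing to the characteristic function of $K$, here is a minimal sketch in one dimension. The choices are assumptions made only for illustration: $K = [0, 1]$, a trapezoidal $h$ that equals 1 on $K$ and falls linearly to 0 at $-1$ and $2$, and a midpoint-rule helper `integral_of_power` (a hypothetical name). For this $h$ one can check by hand that $\int h^n = 1 + 2/(n+1)$, which decreases to $m(K) = 1$.

```python
def h(x):
    """Trapezoid: 1 on K = [0, 1], linear down to 0 at -1 and 2, zero elsewhere."""
    if 0.0 <= x <= 1.0:
        return 1.0
    if -1.0 < x < 0.0:
        return 1.0 + x
    if 1.0 < x < 2.0:
        return 2.0 - x
    return 0.0


def integral_of_power(n, steps=30000):
    """Midpoint-rule approximation of the integral of h**n over [-1, 2]."""
    a, b = -1.0, 2.0
    dx = (b - a) / steps
    return sum(h(a + (i + 0.5) * dx) ** n for i in range(steps)) * dx


# The integrals decrease toward m(K) = 1 as the power grows.
integrals = [integral_of_power(n) for n in (1, 2, 5, 20, 100)]
for n, value in zip((1, 2, 5, 20, 100), integrals):
    print(n, round(value, 4), round(1 + 2 / (n + 1), 4))
```

The printed pairs compare the numerical value with the hand-computed $1 + 2/(n+1)$; the sketch only illustrates the monotone convergence and is not part of the proof.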

By essentially the same argument one can show that, if $f$ is any
function in $C_c(R^k)$ and $K$ is any compact set, then the restricted function
$$k_K f = \begin{cases} f & \text{on } K \\ 0, & \text{off } K \end{cases}$$
is Lebesgue-integrable. By taking real and imaginary parts and writing
real-valued functions in $C_c(R^k)$ as differences of non-negative functions
in $C_c(R^k)$, it is enough to verify this result when $f \ge 0$. In this case
$$k_K f = \lim_n h^n f$$
where $h$ is a function satisfying (7.21); and the convergence is monotone.
Now, it is a fact that, if $g$ is any continuous complex-valued function on a
compact set $K$, there is an $f$ in $C_c(R^k)$ such that $f(X) = g(X)$, $X \in K$.
(See Exercise 10, Section 7.2.) Therefore, if $g$ is any continuous function on
the compact set $K$, the function

(7.22) $$f = \begin{cases} g & \text{on } K \\ 0, & \text{off } K \end{cases}$$

is Lebesgue-integrable. Of course, we also know how to calculate the
integral of $f$, using the function $h$ (7.21) and any extension of $g$ to a function
in $C_c(R^k)$.
Since the integrals of continuous functions of compact support were
defined using Riemann integrals on boxes, it should be clear that in case
$K$ is a closed box, the Lebesgue integral of the function $f$ (7.22) over $R^k$ is
equal to the Riemann integral of $g$ over the box $K$. By taking linear
combinations (sums),
$$g = g_1 k_{B_1} + \cdots + g_n k_{B_n},$$
we see that the same result is valid when $g$ is a piecewise-continuous
function on a box. We have given (in the text, as distinguished from the
exercises) a complete proof when $g$ is a step function, i.e., a linear
combination of characteristic functions of boxes:
$$g = c_1 k_{B_1} + \cdots + c_n k_{B_n}.$$

We can also analyze the integrability of continuous functions which
are not of compact support. Let $f$ be a continuous complex-valued
function on $R^k$. When is $f$ Lebesgue-integrable? Suppose first that $f \ge 0$. We
do the obvious thing. We choose an increasing sequence of closed boxes
$B_n$ which exhaust $R^k$:
$$B_1 \subset B_2 \subset \cdots, \qquad \bigcup_n B_n = R^k$$
and examine the increasing sequence of integrals
$$\int_{B_1} f \le \int_{B_2} f \le \cdots.$$
In order to be careful, let us assume that $B_n$ is contained in the interior of
$B_{n+1}$, $n = 1, 2, 3, \ldots$. As in Exercise 2, Section 7.2, choose $h_n \in C_c(R^k)$
so that
$$h_n(X) = 1, \quad X \in B_n$$
$$0 \le h_n(X) \le 1, \quad X \in B_{n+1} - B_n$$
$$h_n(X) = 0, \quad X \notin B_{n+1}.$$
Then
$$h_1 f \le h_2 f \le h_3 f \le \cdots$$
$$\lim_n h_n(X) f(X) = f(X), \quad \text{for all } X$$
$$\int_{B_n} f \le \int h_n f \le \int_{B_{n+1}} f.$$
Thus, the sequence of integrals $\{\int h_n f\}$ is bounded if and only if the
sequence $\{\int_{B_n} f\}$ is bounded. When these sequences are bounded, the
non-negative function $f$ will be integrable. What does this tell us when $f$
is complex-valued? We know that, if $f$ is integrable, then $|f|$ is integrable
and (by what we just did) the integrals of $|f|$ over the various boxes $B_n$
must be bounded. Write $f = u + iv$ where $u$ and $v$ are real-valued. Since
$|u| \le |f|$ and $|v| \le |f|$, we know that $|u|$ and $|v|$ will be integrable if $|f|$
is. Furthermore $u = u^+ - u^-$, where $u^+ = \max(u, 0)$, $u^- = -\min(u, 0)$.
These are continuous non-negative functions and $|u| = u^+ + u^-$. Therefore,
by the test we just gave for the integrability of non-negative continuous
functions, if $|u|$ is integrable, then $u^+$ and $u^-$ are integrable and hence
$u$ is integrable. Similar comments apply to $v$. We conclude that (since $f$
is continuous) $f$ is integrable if and only if $|f|$ is. So, if $f$ is a continuous
complex-valued function on $R^k$, then $f$ is Lebesgue-integrable if and only if
$$\lim_n \int_{B_n} |f| < \infty$$
where $\{B_n\}$ is any increasing sequence of closed boxes which exhaust $R^k$.
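The box criterion can be tried on two continuous functions on the real line. The sketch below is only an illustration under stated assumptions: the boxes are $B_n = [-n, n]$, the two test functions $1/(1+x^2)$ and $1/(1+|x|)$ are chosen here and do not come from the text, and the helper names are hypothetical. The integrals over $B_n$ are written in closed form ($2\arctan n$ and $2\log(1+n)$), so no numerical quadrature is involved.

```python
import math


def box_integral_f(n):
    """Integral of 1/(1 + x^2) over B_n = [-n, n], in closed form."""
    return 2.0 * math.atan(n)


def box_integral_g(n):
    """Integral of 1/(1 + |x|) over B_n = [-n, n], in closed form."""
    return 2.0 * math.log(1.0 + n)


for n in (1, 10, 10**3, 10**6):
    # First column stays below pi; second column grows without bound.
    print(n, box_integral_f(n), box_integral_g(n))
```

The first sequence is bounded by $\pi$, so $1/(1+x^2)$ passes the test; the second sequence is unbounded, so $1/(1+|x|)$ fails it.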


Exercises 


1. If $f$ and $g$ are Lebesgue-integrable functions and $c$ is a complex number, then
$(cf + g)$ is a Lebesgue-integrable function and $\int (cf + g) = c \int f + \int g$.

2. If $f$ is a Lebesgue-integrable function, then the real part, the imaginary part,
and the absolute value of $f$ are Lebesgue-integrable functions. Furthermore
$$\left| \int f \right| \le \int |f|.$$

3. The “$L^1$-norm”
$$\|f\|_1 = \int |f|$$
is a semi-norm on the space of Lebesgue-integrable functions.

4. If $f$ is a Lebesgue-integrable function and $\int |f| = 0$, then $f(X) = 0$ almost
everywhere.

5. If $f$ is any function on $R^k$ which is [has the value] 0 almost everywhere, then
$f$ is Lebesgue-integrable and $\int f = 0$.

6. If $f$ is a continuous function of compact support, then $f$ is Lebesgue-integrable
and the Lebesgue integral of $f$ is equal to the Riemann integral of $f$.

7. If $f$ is a Lebesgue-integrable function and $f(X) \ge 0$ almost everywhere, then
$\int f \ge 0$.

8. Let $f$ and $g$ be real-valued Lebesgue-integrable functions. Let
$$(f \vee g)(X) = \max(f(X), g(X))$$
$$(f \wedge g)(X) = \min(f(X), g(X)).$$
Then $f \vee g$ and $f \wedge g$ are Lebesgue-integrable functions. (Hint: The max and
min of two continuous functions are continuous.)


9. Prove that the function 
F(x) = ετῦῦι! 


is Lebesgue-integrable on the real line and find its integral. 
10. Prove that the function
$$f(x) = \frac{1}{x^2}, \quad x \ne 0$$
is not Lebesgue-integrable on the real line but that the function
$$g(x) = \begin{cases} x^{-2}, & |x| \ge 1 \\ 0, & |x| < 1 \end{cases}$$
is Lebesgue-integrable. Replace $x^{-2}$ by $|x|^{-1/2}$ and what happens?

11. The function
$$f(x) = \begin{cases} \dfrac{1}{x}, & 0 < |x| < 1 \\ 0, & \text{otherwise} \end{cases}$$
is not Lebesgue-integrable on $R^1$. The function defined on the complex numbers
by
$$f(z) = \begin{cases} \dfrac{1}{z}, & 0 < |z| < 1 \\ 0, & \text{otherwise} \end{cases}$$
is Lebesgue-integrable on $R^2$.
12. True or false? Every integrable function agrees almost everywhere with 
some function which is nowhere continuous. 


7.5. Completeness and Continuity 


We have extended the integral from $C_c(R^k)$ to a much larger space.
In this short section we shall first verify that we have accomplished our
immediate aim of completing $C_c(R^k)$ relative to the $L^1$-norm, and then we
shall discuss the relationship between integrability and continuity.


Notation. We denote by $L^1 = L^1(R^k)$ the space of (complex-valued)
Lebesgue-integrable functions on $R^k$.


Theorem 7. Under the semi-norm
$$\|f\|_1 = \int |f|$$
$L^1(R^k)$ is a complete semi-normed linear space, which contains $C_c(R^k)$ as a
dense subspace.


Proof. If $f \in L^1$, we have a fast $L^1$-Cauchy sequence $\{f_n\}$ in $C_c(R^k)$
such that
$$\lim_n f_n(X) = f(X), \quad \text{a.e.}$$
How do we know that $\lim_n \|f - f_n\|_1 = 0$? Fix a positive integer $p$. Then
$\{|f_n - f_p|\}$ (with $n$ varying) is a fast $L^1$-Cauchy sequence in $C_c(R^k)$ which
converges pointwise almost everywhere to $|f - f_p|$. Thus
$$\|f - f_p\|_1 = \int |f - f_p| = \lim_n \int |f_n - f_p|.$$
But $\{f_n\}$ is $L^1$-Cauchy; in particular
$$\lim_{n,p} \|f_n - f_p\|_1 = 0;$$
hence,
$$\lim_p \|f - f_p\|_1 = \lim_p \lim_n \|f_n - f_p\|_1 = 0.$$

We want to show that each sequence in $L^1$ which is Cauchy relative
to the $L^1$-norm (the $L^1$-semi-norm, to be precise) converges in $L^1$-norm to
an integrable function. It is enough to show that each fast $L^1$-Cauchy
sequence in $L^1$ converges. If $\{f_n\}$ is such a sequence:
$$\sum_n \|f_{n+1} - f_n\|_1 < \infty$$
we can choose (for each $n$) a function $g_n \in C_c(R^k)$ such that
$$\|f_n - g_n\|_1 < 2^{-n}.$$
Then $\{g_n\}$ is a fast $L^1$-Cauchy sequence in $C_c(R^k)$. By the reasoning
in the first part of the proof, there exists $f \in L^1$ such that
$\lim_n \|f - g_n\|_1 = 0$. Plainly then
$$\lim_n \|f - f_n\|_1 = 0.$$


Definition. A null function is a complex-valued function which

(a) is defined almost everywhere on $R^k$;
(b) has the value 0 almost everywhere on $R^k$.


In the exercises of the previous section it was pointed out that the null
functions are precisely the Lebesgue-integrable functions $f$ for which
$\int |f| = 0$. Thus the set of null functions is the subspace of $L^1$ consisting
of the functions of $L^1$-semi-norm zero. As we explained in Chapter 6, we
can (therefore) form the quotient space $L^1/N$ where $N$ is the space of null
functions, and this will be a Banach space which contains $C_c(R^k)$ as a
dense linear subspace. Rather than carry out such a formal construction
it is customary to say that $L^1$ is a Banach space (complete, normed linear
space) provided we agree to identify any two functions which differ by a
null function. This amounts to adopting the following.


Convention. If $f$, $g$ are functions in $L^1$, then $f = g$ will mean that
$f(X) = g(X)$, a.e., that is, that $f - g$ is a null function.

Note that with this convention it does follow that $f = 0$ when $\|f\|_1 = 0$.
We should also remark that $f \ge g$ will mean $f(X) \ge g(X)$ almost
everywhere, etc. Where there is any chance for confusion, we will reinsert
the term “almost everywhere”.


Now we turn our attention to the question: In passing from $C_c(R^k)$
to $L^1(R^k)$, how far away from continuity have we moved? How
discontinuous can an integrable function be? One answer is, of course, that
an integrable function can be nowhere continuous, because we can alter 
the values on a countable dense set in any way whatsoever, without affect- 
ing integrability (or the integral) of the function. On the other hand, we 
know that any integrable function can be approximated by continuous 
functions via pointwise convergence. This approximation can be strength- 
ened to reveal that each integrable function displays a surprising amount 
of continuity, if we know where to look for it. 


Theorem 8. Let $f$ be a Lebesgue-integrable function on $R^k$. If $\epsilon$ and $\delta$
are any positive numbers, there exist a set $S$ and a continuous function $g$ of
compact support such that

(7.23) $$m^*(S) < \delta, \qquad |f(X) - g(X)| < \epsilon, \quad X \notin S.$$

Proof. The proof is quite similar to one we have given before. We
begin by choosing a sequence $\{f_n\}$ in $C_c(R^k)$ such that
$$\|f_{n+1} - f_n\|_1 < 4^{-n}$$
$$\lim_n f_n(X) = f(X), \quad \text{a.e.}$$
For each $n$, let
$$S_n = \{X;\ |f_{n+1}(X) - f_n(X)| \ge 2^{-n}\}.$$
By (7.12)
$$\int |f_{n+1} - f_n| \ge 2^{-n}\, m^*(S_n)$$
and since $\|f_{n+1} - f_n\|_1 < 4^{-n}$, we have
$$m^*(S_n) < 2^{-n}.$$
For each positive integer $N$ define
$$E_N = \bigcup_{n=N}^{\infty} S_n.$$
Note that
$$m^*(E_N) \le \sum_{n=N}^{\infty} m^*(S_n) < \sum_{n=N}^{\infty} 2^{-n}$$
so that
$$m^*(E_N) < 2^{-(N-1)}.$$
Now, given $\epsilon > 0$ and $\delta > 0$, choose a positive integer $N$ such that
$2^{-(N-1)} < \delta$. If $X \notin E_N$, then
$$|f_{n+1}(X) - f_n(X)| < 2^{-n}, \quad n \ge N.$$
Therefore the sequence $\{f_n\}$ converges uniformly on the complement of
$E_N$. Let $T$ be the set of points $X$ such that $\{f_n(X)\}$ does not converge to
$f(X)$. Then $m^*(T) = 0$, and if
$$S = T \cup E_N$$
we have $m^*(S) < \delta$ and $\lim_n f_n(X) = f(X)$, uniformly on the complement
of $S$. The function $g$ in the theorem can be taken to be any $f_n$ such that
$|f - f_n| < \epsilon$ on the complement of $S$. We should remark that $S$ can
always be replaced by a slightly larger open set having the same properties.
(See next corollary.)


Corollary. If $f$ is a Lebesgue-integrable function on $R^k$, there exists a
decreasing sequence of open sets $\{U_n\}$ such that

(i) $\lim_n m^*(U_n) = 0$;

(ii) for each $n$, the restriction of $f$ to the complementary closed set
$K_n = R^k - U_n$ is a continuous function on $K_n$.


Proof. This result is a corollary of the proof of Theorem 8 rather
than of its statement. In the proof we have the decreasing sequence of sets
$\{T \cup E_N\}$, the outer measures of which tend to 0:

(7.24) $$m^*(T \cup E_N) < 2^{-(N-1)}.$$

We can cover $T \cup E_N$ by a sequence of open boxes, the sum of whose
measures is less than $2^{-(N-1)}$. (See Exercise 2, Section 7.3.) Let $V_N$ be the
union of those open boxes, so that $V_N$ is an open set and $m^*(V_N) < 2^{-(N-1)}$.
Let $U_N = V_1 \cap V_2 \cap \cdots \cap V_N$. Then

$U_N$ is open

$T \cup E_N \subset U_N$

$m^*(U_N) < 2^{-(N-1)}$

$U_1 \supset U_2 \supset U_3 \supset \cdots$.

Furthermore, for each $N$, $\{f_n\}$ converges to $f$ uniformly on the complement
$K_N = R^k - U_N$. Thus the restriction of $f$ to $K_N$ is a continuous function
on $K_N$ (Theorem 2, Chapter 5).

The theorem states that if $f$ is integrable, then, given any $\epsilon$ and $\delta$,
there is a continuous function of compact support which is uniformly
within $\epsilon$ of $f$, except on a set of outer measure less than $\delta$. Note what the
corollary does not say. It does not say that, given $\delta > 0$, there is a
closed set $K$ with $m^*(R^k - K) < \delta$ such that $f$ is continuous at each point
of $K$. We know that $f$ need not be continuous at any point. It does say that
if we are willing to totally disregard the values of $f$ on an open set of small
measure, then we obtain a continuous function on the complement of
that set.


As we indicated earlier a Riemann-integrable function comes much 
closer to being continuous. Every such function is continuous almost 
everywhere. We could prove this now. In order to avoid repetition, we will 
defer the proof until the next section. 


Exercises 


1. (Translation-invariance of the integral) If $f$ is a function on $R^k$ and $Y$ is a
vector in $R^k$, the $Y$-translate of $f$ is the function $T_Y f$ defined by
$$(T_Y f)(X) = f(X + Y).$$
Prove that if $f \in L^1(R^k)$, then $(T_Y f) \in L^1(R^k)$ and
$$\int T_Y f = \int f.$$

2. Prove that if $f \in L^1(R^k)$
$$\lim_{Y \to 0} \|f - T_Y f\|_1 = 0.$$
Hence, for a fixed $f$, the map
$$T : R^k \to L^1(R^k), \qquad T(Y) = T_Y f$$
is continuous. (Hint: Prove $\lim_{Y \to 0} T_Y f = f$ first for functions in $C_c(R^k)$, and
approximate.)


3. Let f be an integrable function on the real line. Let 
FQ) = [Κι ναι, χε ΑΚ. 


Prove that $F$ is a continuous function on $R^1$. Prove that, at any point $x$ where
$f$ is continuous, the function $F$ is differentiable and $F'(x) = f(x)$.


4. Use Theorem 8 to show that every integrable function $f$ can be expanded in a
series
$$f = \sum_n f_n + h$$
where $f_1, f_2, f_3, \ldots$ are continuous functions of compact support and $h$ is a
null function.


5. If $f$ is a non-negative integrable function, show that the series expansion in
Exercise 4 cannot necessarily be done with non-negative functions $f_n$.


7.6. The Convergence Theorems


Now we present the several basic convergence theorems which make 
the Lebesgue integral such a powerful tool. First we summarize for 
sequences of integrable functions the results which we exploited for 
sequences of continuous functions of compact support. 


Theorem 9. If $\{f_n\}$ is a sequence of (Lebesgue) integrable functions
such that

(i) $\{f_n\}$ is $L^1$-Cauchy;
(ii) the limit
$$f(X) = \lim_n f_n(X)$$
exists almost everywhere;

then the function $f$ is integrable and $\{f_n\}$ converges to $f$ in $L^1$-norm.


Proof. The first statement need only be proved when $\{f_n\}$ is a fast
$L^1$-Cauchy sequence. Thus we begin with such a sequence:
$$\sum_n \|f_{n+1} - f_n\|_1 < \infty.$$
We will show that it converges almost everywhere to an integrable function
$f$ and that $\|f - f_n\|_1 \to 0$. We use Theorem 8 to find functions
$g_n \in C_c(R^k)$ and sets $S_n$ with these properties:

(a) $\|f_n - g_n\|_1 < n^{-2}$;

(b) $m^*(S_n) < n^{-2}$;

(c) $|f_n(X) - g_n(X)| < n^{-1}$, $X \notin S_n$.

From (a) and the fact that $\{f_n\}$ is a fast $L^1$-Cauchy sequence, we conclude
that $\{g_n\}$ is a fast $L^1$-Cauchy sequence in $C_c(R^k)$. By Theorem 3, the limit
$$f(X) = \lim_n g_n(X)$$
exists almost everywhere, $f$ is an integrable function, and $\{g_n\}$ converges to
$f$ in $L^1$-norm. By condition (a), we see that $\{f_n\}$ also converges to $f$ in
$L^1$-norm. So, all we have to show is that $f_n(X)$ converges to $f(X)$ almost
everywhere.

For each $N$, let
$$E_N = \bigcup_{n=N}^{\infty} S_n.$$
Let $S$ and $T$ be, respectively, the sets on which $\{f_n(X)\}$ and $\{g_n(X)\}$ do not
converge to $f(X)$. Condition (c) guarantees that
$$S \subset T \cup E_N, \quad N = 1, 2, 3, \ldots$$
because, if $X \notin (T \cup E_N)$, then $\{g_n(X)\}$ converges to $f(X)$ and
$\{f_n(X) - g_n(X)\}$ converges to 0. Since
$$m^*(S) \le m^*(T) + m^*(E_N)$$
$$m^*(T) = 0$$
$$m^*(E_N) \le \sum_{n=N}^{\infty} m^*(S_n) < \sum_{n=N}^{\infty} n^{-2}$$
we have
$$m^*(S) < \sum_{n=N}^{\infty} n^{-2}, \quad \text{for all } N.$$
Thus $m^*(S) = 0$.


Theorem 10 (Monotone Convergence Theorem). Let $\{f_n\}$ be a sequence
of real-valued integrable functions on $R^k$ which is monotone-increasing:
$$f_1 \le f_2 \le f_3 \le \cdots.$$
If the sequence of integrals $\{\int f_n\}$ is bounded, then the limit
$$f(X) = \lim_n f_n(X)$$
is finite almost everywhere, $f$ is an integrable function, and
$$\int f = \lim_n \int f_n.$$

Proof. Choose $n_1 < n_2 < n_3 < \cdots$ so that
$$\int f_{n_k} \ge \lim_n \int f_n - 2^{-k}.$$
Then $f_{n_1} \le f_{n_2} \le \cdots$ is a fast $L^1$-Cauchy sequence. So the result follows
immediately from application of Theorem 9 to this subsequence.
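The claim that the subsequence is fast $L^1$-Cauchy can be checked in one line. The worked estimate below is a sketch which assumes the indices are chosen so that $\int f_{n_k} \ge \lim_n \int f_n - 2^{-k}$; the symbol $L$ for the limit of the integrals is introduced here only for convenience.

```latex
% L denotes \lim_n \int f_n, which exists because the integrals
% increase and are bounded.
\begin{align*}
\| f_{n_{k+1}} - f_{n_k} \|_1
  &= \int \bigl( f_{n_{k+1}} - f_{n_k} \bigr)
     && \text{since } f_{n_{k+1}} \ge f_{n_k} \\
  &= \int f_{n_{k+1}} - \int f_{n_k} \\
  &\le L - \int f_{n_k} \;\le\; 2^{-k},
\end{align*}
\[
\sum_{k=1}^{\infty} \| f_{n_{k+1}} - f_{n_k} \|_1
  \;\le\; \sum_{k=1}^{\infty} 2^{-k} \;=\; 1 \;<\; \infty .
\]
```

Because $\{f_n\}$ is monotone, the full sequence has the same pointwise limit as the subsequence, and the integrals $\int f_n$ have the same limit as $\int f_{n_k}$.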


Corollary (Beppo Levi’s Theorem). Let $\{h_n\}$ be a sequence of
non-negative integrable functions and suppose that
$$\sum_{n=1}^{\infty} \int h_n < \infty.$$
Then
$$h(X) = \sum_{n=1}^{\infty} h_n(X) < \infty, \quad \text{a.e.,}$$
$h$ is an integrable function, and
$$\int h = \sum_{n=1}^{\infty} \int h_n.$$


Corollary (Weak Continuity of Integration). If $f_1 \ge f_2 \ge f_3 \ge \cdots$ is
a monotone-decreasing sequence of non-negative integrable functions and
$$\lim_n f_n(X) = 0, \quad \text{a.e.}$$
then
$$\lim_n \int f_n = 0.$$


The following result appeared in the 1906 doctoral dissertation of
Pierre Fatou. In the framework we have adopted, it is a most significant
piece of information.


Theorem 11 (Fatou’s Lemma). Let $\{f_n\}$ be a sequence of non-negative
integrable functions and suppose that the limit
$$f(X) = \lim_n f_n(X)$$
exists almost everywhere and that the sequence of integrals
$$\left\{ \int f_n \right\}$$
is bounded. Then $f$ is an integrable function and

(7.25) $$\int f \le \liminf_n \int f_n.$$


Proof. If $\{x_n\}$ is a sequence of real numbers, let us recall the definition
of
$$\liminf_n x_n.$$
This is the smallest limit point of the sequence. In other words, it is the
smallest number which is the limit of some convergent subsequence of
$\{x_n\}$. Another way to define the lim inf is this: For each $N$ let
$$a_N = \inf_{n \ge N} x_n.$$
Then
$$a_1 \le a_2 \le a_3 \le \cdots$$
and so $\{a_N\}$ converges in the extended real number system and that defines
the lim inf:
$$\liminf_n x_n = \lim_N a_N.$$
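The tail-infimum description of the lim inf is easy to try on a finite piece of a sequence. The following is a minimal sketch, not part of the proof: the helper `tail_infima` and the test sequence $x_n = (-1)^n (1 + 1/n)$ are chosen here for illustration. Because only finitely many terms are stored, the computed values agree with the true $a_N$ only while a negative (odd-indexed) term still lies ahead, so the entries at the very end of the list should be ignored.

```python
def tail_infima(xs):
    """Return the list whose entry N is min(xs[N:]), for a finite list xs."""
    out = []
    running = float("inf")
    for x in reversed(xs):
        running = min(running, x)
        out.append(running)
    return out[::-1]


# x_n = (-1)^n (1 + 1/n) for n = 1, ..., 2000; its lim inf is -1.
xs = [(-1) ** n * (1 + 1 / n) for n in range(1, 2001)]
a = tail_infima(xs)

# The tail infima increase; far from the truncation point they approach -1.
print(a[0], a[9], a[99], a[999])
```

Here `a[N - 1]` plays the role of $a_N$; the printed values climb from $-2$ toward $-1$, which is the lim inf of this sequence.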


Since $\{f_n\}$ converges almost everywhere to $f$, we have
$$f(X) = \liminf_n f_n(X), \quad \text{a.e.}$$
Why? Because, if a sequence converges, it converges to its lim inf (= its
lim sup). Define
$$g_N(X) = \inf_{n \ge N} f_n(X)$$
so that
$$g_1 \le g_2 \le g_3 \le \cdots$$
and
$$f(X) = \lim_N g_N(X), \quad \text{a.e.}$$
Now, we’ll show that each $g_N$ is an integrable function. Assume, for the
moment, that we have shown that. Since $g_N \le f_N$, we have
$$\int g_N \le \int f_N.$$
One of the hypotheses of this theorem is that the integrals of the $f_N$’s are
bounded. So, the integrals of the $g_N$’s are bounded. Apply the monotone
convergence theorem to the sequence $\{g_N\}$. Conclusion: The function $f$ is
integrable and
$$\int f = \lim_N \int g_N.$$
We shall be finished if we prove that

(i) each $g_N$ is integrable;
(ii) $\lim_N \int g_N \le \liminf_n \int f_n$.

Fix an $N$. Let
$$h_n^{(N)} = f_N \wedge f_{N+1} \wedge \cdots \wedge f_{N+n} = \min(f_N, \ldots, f_{N+n}).$$
Then
$$h_1^{(N)} \ge h_2^{(N)} \ge h_3^{(N)} \ge \cdots$$
and
$$\lim_n h_n^{(N)} = \inf_{n \ge N} f_n = g_N.$$
Each $h_n^{(N)}$ is integrable (Exercise 8, Section 7.4). By the monotone
convergence theorem, $g_N$ is integrable and
$$\int g_N = \lim_n \int h_n^{(N)}.$$
Now
$$h_n^{(N)} \le f_j, \quad j = N, \ldots, N + n$$
so that
$$\int h_n^{(N)} \le \int f_j, \quad j = N, \ldots, N + n.$$
In other words,
$$\int h_n^{(N)} \le \min\left\{ \int f_N, \ldots, \int f_{N+n} \right\}.$$
Consequently,
$$\int g_N = \lim_n \int h_n^{(N)} \le \inf_{j \ge N} \int f_j.$$
Now let $N$ get large:
$$\int f = \lim_N \int g_N \le \liminf_n \int f_n.$$
Theorem 12 (Dominated Convergence Theorem). Let $\{f_n\}$ be a sequence
of integrable functions which converges pointwise almost everywhere to
a function $f$. Suppose that there is an integrable function $g$ such that
$|f_n| \le |g|$ for every $n$. Then $f$ is an integrable function and
$$\int f = \lim_n \int f_n.$$

Proof. We may as well assume $g \ge 0$, since we can always replace $g$
by $|g|$. Furthermore, we may as well assume the functions $f_n$ are
real-valued, for if $f_n = u_n + iv_n$, then the sequences $\{u_n\}$ and $\{v_n\}$ satisfy the same
conditions as does $\{f_n\}$.

Now we have
$$-g \le f_n \le g, \quad n = 1, 2, 3, \ldots.$$
Since $g$ and $f_n$ are integrable, $g - f_n$ is integrable. Furthermore,
$$g - f_n \ge 0$$
$$\int (g - f_n) \le 2 \int g.$$
Also
$$\lim_n\, (g(X) - f_n(X)) = g(X) - f(X)$$
almost everywhere. By Fatou’s lemma, $g - f$ is an integrable function and
$$\int (g - f) \le \liminf_n \int (g - f_n).$$
Thus $f$ is an integrable function and
$$\int f \ge \limsup_n \int f_n.$$
We have used the fact that
$$\liminf_n\, (-x_n) = -\limsup_n x_n.$$
Now apply Fatou’s lemma to the sequence $g + f_n$ and you obtain
$$\int f \le \liminf_n \int f_n.$$
Conclusion:
$$\limsup_n \int f_n \le \int f \le \liminf_n \int f_n$$
so that the sequence $\{\int f_n\}$ converges and
$$\int f = \lim_n \int f_n.$$


Corollary (Bounded Convergence Theorem). Let $\{f_n\}$ be a bounded
sequence of integrable functions which converges pointwise almost everywhere
to the function $f$. If there is a set of finite (outer) measure outside of which
every $f_n$ vanishes, then $f$ is integrable and
$$\int f = \lim_n \int f_n.$$

Proof. Suppose $m^*(S) < \infty$ and, for every $n$, $f_n = 0$ outside $S$.
Then there is an open set $U$ such that $S \subset U$ and $m^*(U) < \infty$. (Take $U$ to
be the union of a sequence of open boxes which cover $S$ and have measures
totalling less than $m^*(S) + \epsilon$.) Then $k_U$, the characteristic function of $U$, is
an integrable function. (Why? Because $U = \bigcup_n K_n$ where $K_n$ is an increasing
sequence of compact sets. Apply the monotone convergence theorem.)
If $|f_n| \le M$ for all $n$, the function $g = M k_U$ may be used in the dominated
convergence theorem.


The dominated convergence theorem gives just about the best possible
result enabling us to conclude that
$$\int \left( \lim_n f_n \right) = \lim_n \int f_n.$$
One must not be deluded into thinking that the interchange of the integral
and the limit of the sequence is always possible using the Lebesgue integral.
Some control over the convergence, e.g., as provided by the dominating
function $g$, is necessary. Consider the sequence of functions on $R^1$ defined
by
$$f_k(x) = \begin{cases} 1, & k \le x \le k + 1 \\ 0, & \text{otherwise.} \end{cases}$$
Then $\{f_k\}$ converges pointwise (boundedly) to 0, but
$$\int f_k = 1, \quad \text{for all } k.$$
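The escaping unit bump just described can be checked numerically. This is a sketch under stated assumptions: the functions are taken to be indicators of $[k, k+1]$, and the midpoint-rule helper `integral` and the window $[0, 60]$ are choices made here for illustration.

```python
def f(k, x):
    """Indicator of the interval [k, k + 1]."""
    return 1.0 if k <= x <= k + 1 else 0.0


def integral(k, lo=0.0, hi=60.0, steps=60000):
    """Midpoint-rule approximation of the integral of f(k, .) over [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(f(k, lo + (i + 0.5) * dx) for i in range(steps)) * dx


# Every integral is 1, although at a fixed point the values are eventually 0.
print([round(integral(k), 6) for k in (1, 10, 50)])
print([f(k, 3.7) for k in range(1, 8)])
```

At the fixed point $x = 3.7$ the values vanish as soon as $k \ge 4$, while each integral stays equal to 1. No single integrable function dominates all the bumps, which is why the dominated convergence theorem does not apply.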


As an illustration of the utility of the monotone convergence theorem, 
let us prove the following. Although it is called a theorem, it should be 
taken in the spirit of an example. 


Theorem 13. A complex-valued function $f$ on the interval $[a, b]$ is
Riemann-integrable if and only if

(a) it is bounded;
(b) it is continuous at almost every point of $[a, b]$.

If $f$ is Riemann-integrable, then the function
$$\tilde{f} = \begin{cases} f & \text{on } [a, b] \\ 0 & \text{elsewhere} \end{cases}$$
is Lebesgue-integrable on $R^1$ and $\int \tilde{f} = \int_a^b f(x)\, dx$.
Proof. Obviously we may assume that $f$ is real-valued. We know that
every Riemann-integrable function is bounded. Hence, our task is to show
that a bounded, real-valued function $f$ on $[a, b]$ is Riemann-integrable if
and only if it is continuous at almost every point of $[a, b]$.

First, suppose $f$ is Riemann-integrable. This means that the Riemann
sums

(7.26) $$S(f, P, T) = \sum_{j=1}^{n} f(t_j)(x_j - x_{j-1}), \quad t_j \in [x_{j-1}, x_j]$$

converge to $\int_a^b f(x)\, dx$ as the mesh
$$\|P\| = \max_j\, (x_j - x_{j-1})$$
goes to 0. Or, as we reformulated it, it means that
$$\lim_{\delta \to 0} \operatorname{diam} \Sigma_\delta(f) = 0$$
where $\Sigma_\delta(f)$ is the set of all sums (7.26) which arise from partitions $P$
with $\|P\| < \delta$. For any given partition $P$ we have

(7.27) $$\sum_j m_j (x_j - x_{j-1}) \le S(f, P, T) \le \sum_j M_j (x_j - x_{j-1})$$

where
$$M_j = \sup_{t \in [x_{j-1}, x_j]} f(t), \qquad m_j = \inf_{t \in [x_{j-1}, x_j]} f(t).$$
Furthermore by appropriate choice of $T = (t_1, \ldots, t_n)$ we can make
$S(f, P, T)$ as close as we wish to the sum on the left or the sum on the right.
These sums depend only on the partition $P$. Let us denote the lower sum
(the one using $m_j$’s) by $\underline{S}(f, P)$ and the upper sum (the one using $M_j$’s) by
$\overline{S}(f, P)$. Obviously, then,
$$\operatorname{diam} \Sigma_\delta(f) \ge \sup_{\|P\| < \delta}\, [\overline{S}(f, P) - \underline{S}(f, P)].$$
Accordingly, since $f$ is Riemann-integrable, we have
$$\lim_{\|P\| \to 0}\, [\overline{S}(f, P) - \underline{S}(f, P)] = 0.$$
Choose a sequence of partitions $\{P_n\}$ such that

(a) for each $n$, $P_{n+1}$ is a refinement of $P_n$;
(b) $\lim_n \|P_n\| = 0$;
(c) $\lim_n\, [\overline{S}(f, P_n) - \underline{S}(f, P_n)] = 0$.

Now
$$\overline{S}(f, P_n) = \int_a^b u_n(x)\, dx$$
$$\underline{S}(f, P_n) = \int_a^b v_n(x)\, dx$$
where $u_n$ is the step function which on each subinterval $I$ defined by $P_n$
takes on the constant value $\sup_I f$, and $v_n$ is the corresponding step function
using $\inf_I f$ on each subinterval. Evidently $u_n \ge v_n$ for each $n$ and, because
of the refinement condition (a), we have $u_n \ge u_{n+1}$, $v_n \le v_{n+1}$. Thus,

(7.28) $$v_1 \le v_2 \le \cdots \le f \le \cdots \le u_2 \le u_1.$$


Since $u_n$ is a step function, we know that if we extend it to be 0 outside
$[a, b]$ we obtain a Lebesgue-integrable function on $R^1$. Let us call the
extended function $u_n$ as well. Do the same for each $v_n$. Then we also know
(Example 3) that
$$\int u_n = \int_a^b u_n(x)\, dx = \overline{S}(f, P_n)$$
$$\int v_n = \int_a^b v_n(x)\, dx = \underline{S}(f, P_n).$$
Therefore, the Riemann-integrability, condition (c), tells us that
$$\lim_n \int (u_n - v_n) = 0.$$
By (7.28), $\{(u_n - v_n)\}$ is a monotone-decreasing sequence of non-negative
integrable functions. Since their integrals tend to 0, the monotone
convergence theorem (a corollary of it) tells us that
$$\lim_n\, [u_n(x) - v_n(x)] = 0, \quad \text{a.e.}$$
Now we assert that $f$ is continuous at each point $x$ such that $u_n(x) - v_n(x)$
converges to 0. First, consider a point $x$ which is not an endpoint of any
of the subintervals defined by any of the partitions $P_n$. If $u_n(x) - v_n(x) < \epsilon$,
then there is a (small) open interval about $x$ on which $v_n(x) \le f(t) \le u_n(x)$,
i.e., there is a $\delta > 0$ such that
$$|x - t| < \delta \quad \text{implies} \quad |f(x) - f(t)| \le u_n(x) - v_n(x) < \epsilon.$$
So we see that $f$ is continuous at $x$. A small variation on this argument
works even if $x$ is one of the countable number of end points which occur.
If you don’t want to fuss with that case, just throw this countable set
(which has measure zero) in with the set of $x$’s such that $u_n(x) - v_n(x)$
does not converge to 0, and you will see that we have already proved that
$f$ is continuous almost everywhere.

The argument we have just given is essentially reversible. Suppose $f$
is known to be continuous almost everywhere. Choose a sequence of
partitions which satisfy conditions (a) and (b) as before. If $x$ is any point
where $f$ is continuous (and $x$ is not an endpoint of any of the subintervals),
then
$$\lim_n\, [u_n(x) - v_n(x)] = 0.$$
Therefore the sequence $\{(u_n - v_n)\}$ converges monotonely to a function
which is 0 almost everywhere. Consequently,
$$\lim_n \int (u_n - v_n) = 0.$$
(Which theorem do you need to use to conclude this?) So we have
$$\lim_n\, [\overline{S}(f, P_n) - \underline{S}(f, P_n)] = 0$$
and this holds for any sequence of partitions $\{P_n\}$ which satisfies conditions
(a) and (b). It should be apparent that $f$ is Riemann-integrable.


Exercises 


1. If $S$ is a set of finite outer measure, then
$m^*(S) = \inf \{\int f ;\ f \in L^1,\ f \geq 0,\ f \geq 1 \text{ on } S\}.$
2. If $g$ is a bounded continuous function and $f$ is an integrable function,
then $fg$ is integrable.


3. (Continuity of the integral with respect to a parameter) Let $f(X, Y)$ be a
function which is continuous as a function of $Y$ and integrable as a function
of $X$ (for each fixed $Y$). Suppose that $|f(X, Y)| \leq g(X)$ for some
integrable function $g$ (independent of $Y$). Prove that

$h(Y) = \int f(X, Y)\,dX$

is continuous.


4. If $f \in L^1(R^1)$ the Fourier transform of $f$ is the function

$\hat{f}(t) = \int f(x) e^{-itx}\,dx.$

Show that $\hat{f}$ is uniformly continuous, for each $f$ in $L^1$.


5. Let $f$ be a continuous function on the real line which is everywhere
differentiable and has a bounded derivative. Prove that, if
$-\infty < a < b < \infty$, the function $k_{[a,b]} f'$ is Lebesgue-integrable
and

$\int k_{[a,b]} f' = f(b) - f(a).$

(Hint: Show that the difference quotient $n[f(x + (1/n)) - f(x)]$ converges
pointwise boundedly to $f'(x)$ on $[a, b]$.)


6. The function $f(x) = x^2 \cos x^{-2}$ is everywhere differentiable on the
real line. Is its derivative integrable?


7. Let $f$ be a non-negative integrable function of compact support. Prove that
$\sqrt{f}$ is an integrable function.


8. Let $f$ be a bounded integrable function. Prove that $f^2$ is an integrable
function.

9. Let $\{f_n\}$ be a sequence of integrable functions such that
(i) $0 \leq f_n \leq 1$;
(ii) $\lim \int f_n = 1$;
(iii) there is some compact set $K$ outside of which every $f_n$ vanishes.
Prove that 
lim [ (i — f,)? = 


7.7. Measurable Functions 
and Measurable Sets 


In this section we shall define the concept of measurable function, then 
use it to introduce measurable sets and study the behavior of measure 
(length, area, volume, etc.) as a function on the class of measurable sets. 
This will bring into focus a great deal of what we have done in this chapter. 


Definition. A measurable function on $R^k$ is a complex-valued function
$f$ such that

(i) $f$ is defined almost everywhere on $R^k$;
(ii) there is a sequence of continuous functions (of compact support)
which converges pointwise almost everywhere to $f$.


We should explain right away why we put the phrase “of compact
support” in parentheses. This is because it is relatively easy to see that we
get the same class of “measurable” functions whether we use continuous
functions or continuous functions of compact support. Suppose
$f_1, f_2, f_3, \ldots$ are continuous functions on $R^k$ and

$\lim f_n(X) = f(X)$, a.e.

Let $K_1 \subset K_2 \subset K_3 \subset \cdots$ be an increasing sequence of
compact sets such that

$R^k = \bigcup_n K_n.$

For each $n$ choose a function $h_n \in C_c(R^k)$ such that $h_n = 1$ on $K_n$.
Let $g_n = f_n h_n$. Then $g_n \in C_c(R^k)$ and

$\lim g_n(X) = \lim f_n(X)$

at any point where $\{f_n(X)\}$ converges. This is because, given $X$, there
exists an $N$ such that $X \in K_N$ and hence

$g_n(X) = f_n(X), \quad n \geq N.$


The measurable functions are our candidates for “reasonable” or
“non-pathological” functions. They comprise quite a large class of
functions. Let us note several basic things about this class.

(i) If $f$ and $g$ are measurable functions and $c$ is a complex number,
then $(cf + g)$ is a measurable function. In other words, the collection of
measurable functions forms a linear space in the usual way.

(ii) The function $f$ is measurable if and only if its real and imaginary
parts are measurable functions.

(iii) If $f$ is measurable, then $|f|$ is measurable.

(iv) If $f$ and $g$ are real-valued measurable functions, then

$(f \vee g)(X) = \max (f(X), g(X))$
$(f \wedge g)(X) = \min (f(X), g(X))$

are measurable functions.

(v) If $f$ and $g$ are measurable functions, then the product $fg$ is a
measurable function.

(vi) Every null function is measurable.


We have left the proofs of these six assertions to the exercises. The
first five follow immediately from the corresponding facts about
continuous functions. Statement (vi) requires a little bit more thought (but
not much more).

The reader should compare very carefully the definition of measurable 
function with the definition of integrable function. Integrability appears 
to require a great deal more. With the machinery we have built up, we 
can show that integrability just amounts to “measurability plus some 
control on size”. 


Theorem 14. If $f$ is a measurable function, then $f$ is integrable if and
only if there exists an integrable function $g$ such that $|f| \leq |g|$. In
particular, every bounded measurable function of compact support is integrable.


Proof. As is frequently the case, the only non-trivial part is the “if” 
half of the theorem. For the proof, once again we work only with the case 
of real-valued functions. We are told that (real-valued) f is measurable, 
so we have real continuous functions of compact support f, such that 


$f(X) = \lim f_n(X)$, a.e.

We also have $|f| \leq g$ where $g$ is a non-negative integrable function. In
other words,

$-g \leq f \leq g.$

Let

$h_n = (-g) \vee (f_n \wedge g),$

that is

$h_n(X) = f_n(X)$, if $-g(X) \leq f_n(X) \leq g(X)$
$h_n(X) = g(X)$, if $f_n(X) > g(X)$
$h_n(X) = -g(X)$, if $f_n(X) < -g(X)$.

Since $g$ is integrable (and $f_n$ is), each $h_n$ is integrable. Furthermore,

$\lim h_n(X) = f(X)$, a.e.

By the dominated convergence theorem, $f$ is integrable.

The assertion about measurable functions which are bounded and 
have compact support follows from the observation that such a function 
is dominated by (a constant multiple of) the characteristic function of a 
compact set. 
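The truncation used in the proof of Theorem 14 is just a clamp of each value of $f_n$ to the interval $[-g, g]$. The sketch below illustrates this at a single point; the helper name `truncate` and the sample values are hypothetical, chosen only to show that clamping keeps every term dominated by $g$ without disturbing the limit when $|f| \leq g$.

```python
def truncate(value, bound):
    """Return (-bound) v (value ^ bound): value clamped to [-bound, bound]."""
    return max(-bound, min(value, bound))

# Values at one fixed point X: f(X) = 0.5, g(X) = 0.75, f_n(X) = 0.5 + 1/n.
f_value, g_value = 0.5, 0.75
h_values = [truncate(f_value + 1.0 / n, g_value) for n in range(1, 1001)]

print(h_values[0])    # the first term is cut down to g(X)
print(h_values[-1])   # later terms are untouched and approach f(X)
```

Early terms that overshoot $g$ are cut down to $g$, while terms already inside $[-g, g]$ are left alone, so the clamped sequence still converges to $f$ at that point.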


Corollary. If f is an integrable function and g is a bounded measurable 
function, then fg is integrable. 

Proof. This follows immediately from Theorem 14 as soon as one 
notes that, since the product of two continuous functions is continuous, 
the product of two measurable functions is measurable. 


Corollary. If {f,} is a sequence of measurable functions which converges 
pointwise almost everywhere to the function f, then f is a measurable function. 

Proof. We may assume that the $f_n$’s are real-valued. We now truncate
the function $f$ by chopping off both its head and its feet. Let
$K_1 \subset K_2 \subset K_3 \subset \cdots$ be an increasing sequence of
compact sets which exhaust $R^k$:

$\bigcup_n K_n = R^k.$

For each $n$ let

$h_n = (-n) \vee (k_{K_n} f) \wedge n$

that is,

$h_n(X) = 0$, if $X \notin K_n$
$h_n(X) = f(X)$, if $X \in K_n$ and $-n \leq f(X) \leq n$
$h_n(X) = n$, if $X \in K_n$ and $f(X) > n$
$h_n(X) = -n$, if $X \in K_n$ and $f(X) < -n$.

Then $h_n$ is a measurable function. In fact, $h_n$ is an integrable function,
because the sequence of functions

$(-n) \vee (k_{K_n} f_j) \wedge n, \quad j = 1, 2, 3, \ldots$

converges pointwise boundedly to $h_n$, and all these functions vanish outside
$K_n$ (bounded convergence theorem). Since $h_n \in L^1$ we can find a
continuous function of compact support $g_n$ and a set $S_n$ such that

$m^*(S_n) < n^{-2}$
$|g_n(X) - h_n(X)| < 1/n, \quad X \notin S_n.$

We assert that the sequence $\{g_n\}$ converges pointwise to $f$ almost
everywhere. Of course we let

$E_N = \bigcup_{n=N}^{\infty} S_n.$

Let $T$ be the set of points where $f$ is not defined, and let

$E = \bigcap_N E_N.$

Then $T$ is a set of measure zero, and so is $E$ because

$E_1 \supset E_2 \supset E_3 \supset \cdots$
$m^*(E_N) \leq \sum_{n=N}^{\infty} n^{-2}.$

We can see rather easily that $\{g_n\}$ converges pointwise to $f$ outside the
set of measure zero $(T \cup E)$. If $X \notin (T \cup E)$, then $f(X)$ is some
real number (and $X$ is in some $K_n$), so there is a positive integer $M$ such
that

$X \in K_n, \quad n \geq M$
$|f(X)| \leq M.$

Then we have

(7.29) $h_n(X) = f(X), \quad n \geq M.$

Since $X \notin E$, there exists an $N$ such that $X \notin E_N$. Accordingly,

(7.30) $|g_n(X) - h_n(X)| < 1/n, \quad n \geq N.$

It is apparent from (7.29) and (7.30) that $\lim g_n(X) = f(X)$.


Corollary. If $f_1 \leq f_2 \leq f_3 \leq \cdots$ is a monotone-increasing
sequence of real-valued measurable functions and the limit

$f(X) = \lim f_n(X)$

is finite almost everywhere, then $f$ is a measurable function.


It is customary (and quite useful) to extend the integral to all
non-negative measurable functions. The only way such a function can fail to
be integrable is if its values are “too big, too often”. In that case its
integral will be $+\infty$.


Definition. If $f$ is a non-negative measurable function, the integral of $f$
is the supremum of the integrals of all integrable functions $g$ such that
$g \leq f$:

$\int f = \sup \{\int g ;\ g \in L^1,\ g \leq f\}.$

Note that we have not defined the integral for all measurable func- 
tions. We have done so only for the non-negative ones (and the integrable 
ones, of course.) With the definitions we have used, the following is a 
very important summary result.

Theorem 15. If $f$ is a complex-valued function defined almost everywhere
on $R^k$, the following are equivalent.

(i) $f$ is integrable.
(ii) $f$ is measurable and $\int |f| < \infty$.
(iii) There exists a sequence of continuous functions of compact support
$\{f_n\}$ such that

$\lim f_n(X) = f(X)$, a.e.
$\sup_n \int |f_n| < \infty.$

Proof. Use Fatou’s lemma and the monotone convergence theorem.


Definition. A subset of $R^k$ is called a measurable set if its characteristic
function is a measurable function.


The following properties of measurable sets are virtually immediate
from what we know about measurable functions.


(i) If $S$ and $T$ are measurable sets, the union $S \cup T$ and the
intersection $S \cap T$ are measurable sets.
(ii) The set $S$ is measurable if and only if its complement $R^k - S$
is a measurable set.
(iii) If $\{S_n\}$ is a sequence of measurable sets, then the union
$\bigcup S_n$ and the intersection $\bigcap S_n$ are measurable sets.
(iv) Every compact set is measurable.
(v) Every set of measure zero is a measurable set.


What measurable sets can we identify? Since every compact set is 
measurable, every closed set is measurable, because such a set is the union 
of an increasing sequence of compact sets. Every open set is such a union 
also; hence every open set is measurable. Of course we can also see this 
because the complement of an open set is closed. So the class of mea- 
surable sets contains all closed sets (and all open sets) and any set that we 
can generate from such sets using a countable number of operations 
involving union, intersection, and complementation. In addition it con- 
tains all sets of measure zero and sets we can generate by union, intersec- 
tion, and complement, using either closed sets or sets of measure zero. 
That is about all we will try to say for now, except for the following 
illustrative example. 


EXAMPLE 4. Let $U$ be an open set. Let us describe how to calculate its
measure $m(U)$. Start with a gridwork of mesh 1 as in Figure 27. To be
precise, subdivide each coordinate axis into the non-overlapping
semi-closed intervals $[n, n + 1)$ for $n = 0, \pm 1, \pm 2, \pm 3, \ldots$.
This will partition $R^k$ into an infinite number of non-overlapping boxes.
As a first approximation to $m(U)$ we take

$\sum_n m(B_n)$

where $B_1, B_2, \ldots$ are those boxes (formed by the gridwork) whose closures
lie in $U$. There may be none of these, there may be a finite positive number,
or there may be infinitely many. Let $K_1$ be the closure of the union of these
boxes $B_n$. It should be clear that $K_1$ is also the union of the closures of
the boxes $B_n$, because they are all of fixed size so that no sequence of
points from distinct boxes can converge. Since

$\bigcup_n B_n \subset K_1 = \bigcup_n \overline{B}_n$

we have

$\sum_n m(B_n) \leq m(K_1) \leq \sum_n m(\overline{B}_n) = \sum_n m(B_n)$

so that

$m(K_1) = \sum_n m(B_n).$


Now apply the same process to the open set $U - K_1$, using a gridwork
of mesh 1/2. One obtains a closed set $K_2 \subset U - K_1$ which is the union
of a (finite or infinite) sequence of closed boxes of size 1/2, the sum of whose
measures is $m(K_2)$. Apply the process to $U - K_1 - K_2$ using a gridwork
of size 1/4, and continue. The result is that

$U = \bigcup_n K_n$

where $K_1, K_2, K_3, \ldots$ are pairwise disjoint closed sets, and each $K_n$
is the union of a sequence of closed boxes whose measures total $m(K_n)$. Put
all the boxes together and $U$ is expressed as the union of a sequence of boxes
whose measures total $m(U)$. (See Theorem 16.) With a bit more care, one
can express $U$ as the union of a sequence of non-overlapping boxes (not
closed ones, of course); but, from the point of view of measure, this is not a
significant refinement.
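The exhaustion procedure of Example 4 can be traced by hand in one dimension. The sketch below is an illustration only, under the assumption that $U$ is a single open interval $(a, b)$ with rational endpoints; the function name `exhaust_interval` is hypothetical. At mesh $2^{-j}$ it keeps the closed grid cells lying inside what remains of $U$, removes them, and records the running total of their lengths.

```python
from fractions import Fraction
from math import ceil, floor

def exhaust_interval(a, b, levels):
    """Running totals of chosen cell lengths for meshes 1, 1/2, 1/4, ..."""
    components = [(Fraction(a), Fraction(b))]  # open pieces not yet covered
    total = Fraction(0)
    totals = []
    for j in range(levels):
        h = Fraction(1, 2 ** j)
        remaining = []
        for lo, hi in components:
            first = floor(lo / h) + 1   # smallest grid index with first*h > lo
            last = ceil(hi / h) - 1     # largest grid index with last*h < hi
            if last > first:
                # The closed cells [n*h, (n+1)*h], first <= n < last, lie in
                # (lo, hi); their union is the closed interval [first*h, last*h].
                total += (last - first) * h
                remaining.append((lo, first * h))
                remaining.append((last * h, hi))
            else:
                remaining.append((lo, hi))
        components = remaining
        totals.append(total)
    return totals

totals = exhaust_interval(0, 3, 11)
print(totals)
```

For $U = (0, 3)$ the totals increase toward 3, the length of the interval, in keeping with the claim that the measures of the chosen boxes total $m(U)$.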


Definition. If $S$ is a measurable set, the measure of $S$ is the integral of
the characteristic function of $S$:

$m(S) = \int k_S.$


Lemma. Let $S$ and $T$ be measurable sets.
(i) $0 \leq m(S) \leq \infty$.
(ii) If $S \subset T$, then $m(S) \leq m(T)$.
(iii) $m(S \cup T) \leq m(S) + m(T)$ and equality holds if
$S \cap T = \emptyset$.
Proof. (i) and (ii) are immediate. For (iii) simply note that

$k_{S \cup T} \leq k_S + k_T$

and equality holds if $S \cap T = \emptyset$.




Theorem 16. Let $\{S_n\}$ be a sequence of measurable sets and let

$S = \bigcup_n S_n.$

Then $S$ is measurable and
(i) (continuity of measure) if $S_1 \subset S_2 \subset S_3 \subset \cdots$,
then

$m(S) = \lim m(S_n);$

(ii) (countable additivity of measure) if $S_i \cap S_j = \emptyset$,
$i \neq j$, then

$m(S) = \sum_{n=1}^{\infty} m(S_n).$


Proof. (i) and (ii) are equivalent assertions, because if $S$ is the union
of an increasing sequence $\{S_n\}$, we also have

$S = S_1 \cup (S_2 - S_1) \cup (S_3 - S_2) \cup \cdots$

to which (ii) is applicable; and we know that

$m(S_n) = m(S_{n-1}) + m(S_n - S_{n-1}).$

To verify (i), let $k_n$ be the characteristic function of $S_n$, so that

$k_1 \leq k_2 \leq k_3 \leq \cdots$
$\lim k_n(X) = k(X)$

where $k = k_S$. Consider the limit

$\lim m(S_n) = \lim \int k_n.$

If this limit is finite, the monotone convergence theorem tells us that $k$
is integrable and its integral is the limit, i.e.,

(7.31) $m(S) = \lim m(S_n).$

If the limit is $+\infty$, then $m(S) = \infty$ because $m(S) \geq m(S_n)$ for
every $n$. Thus (7.31) is trivially satisfied.
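A small worked instance of countable additivity may help. The sketch below assumes the pairwise disjoint intervals $S_n = [1/(n+1), 1/n)$, whose union is $(0, 1)$; this family and the helper name `partial_measure` are illustrative choices, not taken from the text. The partial sums of the lengths telescope to $1 - 1/(N+1)$, which tends to $m((0, 1)) = 1$.

```python
from fractions import Fraction

def partial_measure(count):
    """Sum of the lengths of S_n = [1/(n+1), 1/n) for n = 1, ..., count."""
    return sum(Fraction(1, n) - Fraction(1, n + 1) for n in range(1, count + 1))

for count in (1, 9, 99, 999):
    print(count, partial_measure(count))
```

The same numbers illustrate continuity of measure: the increasing unions $S_1 \cup \cdots \cup S_N = [1/(N+1), 1)$ have measures equal to these partial sums.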


We will have a very complete picture of “measure” as soon as we 
accomplish one more task. We must show that m(S) is what it ought to be 
for all simple sets S. First let us note that our notation is consistent with 
what we did earlier. 


(i) $S$ is a set of measure zero if and only if $S$ is measurable and
$m(S) = 0$. This is immediate because

$\int k_S = 0$

if and only if $k_S$ is a null function.
(ii) If $B$ is a box, then $m(B)$ is the $k$-dimensional volume which we
previously denoted by $m(B)$. See Exercises 4, 5, and 6 of Section 7.3.

Theorem 17. If $S$ is a measurable set, then

$m(S) = \inf \{m(U);\ U \text{ open},\ S \subset U\}.$

Consequently, $m(S) = m^*(S)$.

Proof. First note that $m(T) \leq m^*(T)$ for all measurable sets $T$. For,
if $\{B_n\}$ is a sequence of boxes which cover $T$,

$m(T) \leq \sum_n m(B_n).$

Now, if $m(S) = \infty$, then $m(U) = \infty$ for all open $U$ such that
$S \subset U$. Similarly, $m^*(S) = \infty$. So the theorem is trivial in this
case.

Suppose (then) that $m(S) < \infty$. This means that $k_S$ is an integrable
function. Let $\varepsilon > 0$. By the corollary to Theorem 8, we can find an
open set $W$ and a real-valued function $f \in C_c(R^k)$ such that

(a) $m^*(W) < \varepsilon$ and hence $m(W) < \varepsilon$;
(b) $|f(X) - k_S(X)| < \varepsilon$, $X \notin W$.

Let $V = \{X;\ f(X) > 1 - \varepsilon\}$. Condition (b) tells us two things:

$X \notin W,\ X \in S \implies X \in V$
$X \notin W,\ X \notin S \implies X \notin V.$

In other words,

$S \subset (V \cup W)$
$V \subset (S \cup W).$

From the second relation we have

$m(V) \leq m(S) + m(W) < m(S) + \varepsilon.$

If we let $U = V \cup W$, then $U$ is an open set which contains $S$ and

$m(U) \leq m(V) + m(W) < m(S) + 2\varepsilon.$

This establishes the first assertion of the theorem.
To see that $m(S) = m^*(S)$ just note (Example 4) that each open set $U$
is the union of a sequence of boxes $\{B_n\}$ so that

$m(U) = \sum_{n=1}^{\infty} m(B_n).$

If $S \subset U$, then

$m^*(S) \leq \sum_{n=1}^{\infty} m(B_n) = m(U)$

and, if we take the infimum over open sets $U$, we see that
$m^*(S) \leq m(S)$.

Corollary. If $S$ is a measurable set, then

$m(S) = \sup \{m(K);\ K \text{ compact},\ K \subset S\}.$

Proof. Apply the theorem to $R^k - S$.

In many situations we are interested in the integration of functions
which are defined on a subset of $R^k$ rather than on the whole space. For
example, in the study of Fourier series, we are interested in functions on
some interval of the real line, e.g., the interval $[-\pi, \pi]$. We want to
have complete theories of integration on such subsets. For instance we want to
assert that completion of the space of continuous functions on an interval
$[a, b]$ relative to the norm

$\int_a^b |f(x)|\,dx$

yields the space of Lebesgue-integrable functions on $[a, b]$. Fortunately,
with the theory of integration on $R^k$ in hand, none of this presents us with
any difficulties. All that we do is restrict the functions in $L^1(R^k)$ to the
subset and forget everything that happens outside it.


Definition. If $S$ is a measurable subset of $R^k$, and $f$ is an integrable
function (or a non-negative measurable function) on $R^k$, the integral of $f$
over $S$ is

$\int_S f = \int k_S f.$


The following properties are immediate from the definition and
simple results which we know.


(i) If $S$ and $T$ are disjoint measurable sets,

$\int_{S \cup T} f = \int_S f + \int_T f.$

(ii) If $m(S) = 0$,

$\int_S f = 0.$

(iii) If $f$ is real-valued,

$m(S) \inf_S f \leq \int_S f \leq m(S) \sup_S f.$


Definition. If $S$ is a measurable subset of $R^k$, a measurable function on
$S$ is a complex-valued function $f$ such that


(a) $f$ is defined at almost every point of $S$;
(b) there is a sequence of continuous functions $\{f_n\}$ of compact support
on $R^k$ such that $\lim f_n(X) = f(X)$ for almost every $X$ in $S$.


An integrable function on $S$ is one which satisfies (a), (b), and
(c) the sequence of integrals $\int_S |f_n|$ is bounded.


Lemma. Let $f$ be a complex-valued function defined almost everywhere
on $S$. The following are equivalent.


(i) $f$ is a measurable [integrable] function on $S$.
(ii) There is a measurable [integrable] function $g$ on $R^k$ such that the
restriction of $g$ to $S$ is $f$, i.e., $g(X) = f(X)$ for all $X \in S$.
(iii) The function

$\begin{cases} f, & \text{on } S \\ 0, & \text{off } S \end{cases}$

is a measurable [integrable] function on $R^k$.
Proof. Exercise.
Armed with this simple lemma we have available on $S$ all of the
convergence theorems we developed for measurable and integrable
functions on $R^k$. Of course we denote by $L^1(S)$ the space of integrable
functions on $S$ and endow it with the norm

$\|f\|_1 = \int_S |f|.$

It contains the restriction to $S$ of every continuous function of compact
support, and is complete relative to the $L^1$-norm. All of this is apparent
because $L^1(S)$ is (can be identified with) the subspace of $L^1(R^k)$
consisting of those functions which vanish outside of $S$. Of course we will
not use the notation $\|\ldots\|_1$ for the $L^1$-norm on $S$ unless it is clear
from the context that we have restricted our attention to functions on $S$.


Let us establish one non-trivial result about integrals over subsets. 


Theorem 18. Let $f$ be an integrable function on $R^k$. For each
$\varepsilon > 0$ there exists $\delta > 0$ such that

$\int_S |f| < \varepsilon$

for every measurable set $S$ such that $m(S) < \delta$.


Proof. Suppose the result fails for some $\varepsilon$. Then there is a
sequence of measurable sets $\{S_n\}$ such that

$m(S_n) < 2^{-n}$
$\int_{S_n} |f| \geq \varepsilon.$

Let

$E_N = \bigcup_{n \geq N} S_n, \quad E = \bigcap_N E_N.$

Since $E_1 \supset E_2 \supset E_3 \supset \cdots$, the dominated convergence
theorem states that

$\int_E |f| = \lim_N \int_{E_N} |f| \geq \varepsilon.$

On the other hand, $m(E) = \lim_N m(E_N) = 0$, a contradiction.

Exercises


In view of the large number of exercises in this section, we have divided them 
into three general categories. 


A. Measurable Functions 


1. Be sure that you see how statements (i)-(v) on p. 337 follow directly from 
the corresponding statements about continuous functions. 


2. Prove that every null function is a measurable function. 


3. If $f$ is a non-negative measurable function and $\{g_n\}$ is any sequence
of integrable functions such that

$g_1 \leq g_2 \leq g_3 \leq \cdots$
$\lim g_n(X) = f(X)$, a.e.

prove that $\lim \int g_n = \int f$.


4. If $f$ and $g$ are non-negative measurable functions and $c > 0$, then

$\int (cf + g) = c \int f + \int g.$


5. If $f$ is a non-negative integrable function, discuss

$\int \log f.$

Show that it is always sensibly defined if we allow $-\infty$ as a value.
(Hint: $\log x < x$ if $x > 1$.)


6. Let $\omega$ be a fixed non-negative integrable function. Let $L(\omega)$ be
the set of measurable functions $f$ such that $f\omega$ is integrable. For
$f \in L(\omega)$ define

$\|f\| = \int |f| \omega.$

Prove the following:
(a) $(L(\omega), \|\ldots\|)$ is a semi-normed linear space.
(b) That space is complete.
7. Let 


f(xy=sint, 0<|x/<1. 


Is f an integrable function on [—1, 1]? 


8. Let $f$ be a fixed real-valued integrable function on $R^k$. For each
measurable set $S$ define

$F(S) = \int_S f.$

Let

$b = \sup_S F(S).$

True or false? There exists a measurable set $S$ such that $F(S) = b$.
True or false? The image of $F$ is connected.

9. Let $K$ be the set of measurable functions $f$ such that $0 \leq f \leq 1$.

(a) $K$ is a convex set.
(b) $f$ is an extreme point of $K$ if and only if $f$ is the characteristic
function of a measurable set.


10. If $f$ is a real measurable function on $R^k$ and $g$ is a continuous
function on the real line, then the composition $g \circ f$ is a measurable
function.


11. If $f$ is a continuous function on $R^1$, if $g$ is a measurable function
on $R^1$, and if $f^{-1}(E)$ is measurable for every set of measure zero
$E \subset R^1$, then the composition $g \circ f$ is measurable.


*12. Define a function $f$ on the interval $[0, 1]$ as follows: If

$x = \sum_{n=1}^{\infty} a_n 2^{-n}, \quad a_n = 0 \text{ or } 1$

is the binary expansion of $x$, define

$f(x) = \sum_{n=1}^{\infty} 2 a_n 3^{-n}.$

(a) Show that $f$ is well-defined almost everywhere and that $f$ is a
measurable function on $[0, 1]$.

(b) Show that $f$ maps $[0, 1]$ into the Cantor set.

(c) True or false? If $g$ is a continuous function (on the Cantor set), then
$g \circ f$ is a measurable function.


B. Measurable Sets 

13. Prove that the set $S \subset R^k$ is measurable if and only if $R^k - S$
is measurable.
14. If $\{S_n\}$ is a sequence of measurable sets, show that $\bigcup S_n$ and
$\bigcap S_n$ are measurable sets.

15. If $S$ is a measurable set and $m(S) > 0$, there exists a point $X \in S$
such that $m(N \cap S) > 0$ for every neighborhood $N$ of $X$.


16. (Translation-invariance of measure) If $S$ is a measurable set, each
translate of $S$

$Y + S = \{Y + X;\ X \in S\}$

is measurable and $m(Y + S) = m(S)$.


17. If $B$ is a ball in $R^k$, then ($B$ is measurable) and $m(B)$ depends only
on the radius of $B$.


18. Let $\theta$ be an angle, $0 \leq \theta < 2\pi$. Let $S$ be a rectangle in
$R^2$ and let $S_\theta$ be the rectangle obtained by rotating $S$ through the
angle $\theta$ about the origin. Show that $S$ can be decomposed into a finite
number of triangles which can be translated to new positions in $R^2$ and
reassembled to form $S_\theta$. Conclude that Lebesgue measure on $R^2$ is
rotation-invariant.

19. True or false? If $S$ and $T$ are measurable sets, the algebraic sum

$S + T = \{X + Y;\ X \in S,\ Y \in T\}$

is a measurable set.


20. Construct a set $S$ in the plane such that $m(S) = 0$ but uncountably many
vertical lines meet $S$ in a set of positive 1-dimensional measure.


21. Describe how to construct a measurable subset $S$ of the interval $[0, 1]$
such that, in each subinterval of $[0, 1]$, both $S$ and its complement have
positive measure, i.e., if $0 \leq a < b \leq 1$, then

$0 < m(S \cap (a, b)) < b - a.$

22. There does not exist a measurable subset $S$ of $[0, 1]$ such that

$m(S \cap (a, b)) = \frac{1}{2}(b - a)$

for each subinterval $(a, b)$.


23. Let $f$ be a real-valued function on $R^k$. Show that $f$ is measurable if
and only if the graph of $f$ is a measurable set in $R^{k+1}$.


*24. If $S$ is a linear subspace of $R^k$ and if $m(S) > 0$, then $S = R^k$.


25. If $S$ is a measurable set and $f$ is a real-valued integrable function,
then

$m(S) \inf_S f \leq \int_S f \leq m(S) \sup_S f.$

26. Define the inner measure of the (arbitrary) set $S$ to be

$m_*(S) = \sup \{m(K);\ K \text{ compact},\ K \subset S\}.$

Prove that $S$ is measurable if and only if $m^*(S) = m_*(S)$.
27. A family $\mathcal{S}$ of subsets of $R^k$ is called a sigma-algebra of
sets if it satisfies


(i) the empty set and $R^k$ are in $\mathcal{S}$;
(ii) if $S \in \mathcal{S}$, then the complement $R^k - S$ is in
$\mathcal{S}$;
(iii) if $\{S_n\}$ is any sequence of sets in $\mathcal{S}$, the union
$\bigcup S_n$ is in $\mathcal{S}$.


Prove the following.


(a) The intersection of any collection of sigma-algebras is a sigma-algebra.

(b) Given any family of sets in $R^k$ there is a smallest sigma-algebra of
subsets of $R^k$ which contains the family. (This is called the sigma-algebra
generated by the family.)

(c) The sigma-algebra generated by the family of compact subsets of $R^k$ is
the same as the sigma-algebra generated by the family of closed boxes in $R^k$.
(This sigma-algebra is called the family of Borel sets in $R^k$.)

(d) A set $S$ in $R^k$ is measurable if and only if it differs from some Borel
set by a set of measure zero.


28. (Invariance of measure under rigid motions) If $S$ is a measurable subset
of $R^k$, show that

$m(S) = \inf_{\{B_n\}} \sum_{n=1}^{\infty} m(B_n)$

where the infimum is taken over all countable coverings of $S$ by open balls.
Use the result of Exercise 17 to show that Lebesgue measure on $R^k$ is
invariant under (unchanged by) rigid motions of $R^k$.

29. (Carathéodory) The subset $S$ of $R^k$ is measurable if and only if

$m^*(E) = m^*(E \cap S) + m^*(E \cap (R^k - S))$

for every set $E \subset R^k$.


C. Simple Functions and the Lebesgue Process 


30. Let $f$ be a measurable function. If $K$ is a compact set in the complex
plane, then $f^{-1}(K)$ is a measurable set. Do the same for $f^{-1}(U)$ where
$U$ is open. (Hint: It’s true for continuous functions.)


31. A simple function is a function $s$ of the form

$s = \sum_{n=1}^{N} a_n k_{E_n}$

where $a_1, \ldots, a_N$ are constants and $E_1, \ldots, E_N$ are measurable
sets. In other words, a simple function is one which is a linear combination of
characteristic functions of measurable sets. Prove the following.


(a) The simple functions constitute a linear subspace of the (bounded)
measurable functions.

(b) If $f$ is a bounded measurable function, then $f$ can be uniformly
approximated by simple functions. (Hint for (b): Take a square in the plane
which contains the image of $f$. Cut it up into small squares, and let the
$E_n$’s be the inverse images (under $f$) of those squares.)


32. If $f$ is integrable and $\varepsilon > 0$, there is a simple function $s$
such that

$\int |f - s| < \varepsilon.$


33. Let $f$ be a real-valued function defined almost everywhere on $R^k$.
Suppose that the set

$E_t = \{X;\ f(X) > t\}$

is measurable, for every real number $t$. Prove that $f$ is a measurable
function. (Hint: Use complements, intersections, etc. of the sets $E_t$ for
various $t$’s to see that $\{X;\ f(X) \leq t\}$, $\{X;\ a < f(X) \leq b\}$,
etc. are measurable. Approximate $f$ by simple functions.)
34. (Lebesgue’s ladder) Let $f$ be a non-negative integrable function. Let

$E_{11} = \{X;\ 0 \leq f(X) < \frac{1}{2}\}$

$E_{12} = \{X;\ \frac{1}{2} \leq f(X) < 1\}.$

Let

$S_1 = 0 \cdot m(E_{11}) + \frac{1}{2} m(E_{12}).$

Now divide the interval $[0, 2)$ into 8 subintervals, $[0, \frac{1}{4})$,
$[\frac{1}{4}, \frac{1}{2})$, ..., $[\frac{7}{4}, 2)$ and let
$E_{21}, \ldots, E_{28}$ be the inverse images under $f$. Let

$S_2 = 0 \cdot m(E_{21}) + \frac{1}{4} m(E_{22}) + \cdots + \frac{7}{4} m(E_{28}).$

Now, divide $[0, 3)$ twice as finely, i.e., into 24 subintervals, form $S_3$;
and continue. Prove that

$\lim_k S_k = \int f.$


(Be sure to draw a picture of what’s going on. Show that there is nothing sacred
about the particular partitions of the real line we used.)

D. A Non-Measurable Set


35. Lebesgue measure on the real line induces an analogous measure (arc length)
on the unit circle $C$. The measure of any arc

$\{e^{i\theta};\ \theta_1 \leq \theta \leq \theta_2\}$

is $(\theta_2 - \theta_1)$, the outer measure of a subset of the circle is the
infimum of the sum of the lengths of the arcs in the various countable
coverings by arcs, etc. Just as Lebesgue measure on the line is
translation-invariant, this measure on the circle is rotation-invariant: If $S$
is a measurable subset of the unit circle, then

$m(e^{i\theta} S) = m(S).$

Let $G$ be the set of all points $e^{in}$, $n = 0, \pm 1, \pm 2, \ldots$. These
points are distinct because $e^{in} = e^{ik}$ is not possible with $n$ and $k$
distinct integers. Consider the various rotations of $G$:

$e^{i\theta} G = \{e^{i(\theta + n)};\ n = 0, \pm 1, \pm 2, \ldots\}.$

(a) Prove that $e^{i\theta} G = e^{i\psi} G$ if and only if
$e^{i(\theta - \psi)} \in G$, i.e., if and only if
$\theta - \psi = n + 2k\pi$ where $n$, $k$ are integers.

(b) Prove that $G$ has uncountably many distinct rotations.

(c) According to the axiom of choice (Appendix) there is a set $S$ comprised
of exactly one point from each of the distinct rotations of $G$. For such an
$S$, show that the rotations $e^{in} S$ are pairwise disjoint and exhaust the
circle:

$\bigcup_n e^{in} S = C.$

(d) From (c) show that $S$ cannot be a measurable set. (Hint: By the
rotation-invariance, the various sets $e^{in} S$ must all have the same
measure.)


7.8. Fubini’s Theorem


We come now to one of the most important theorems in mathematical
analysis. It is the extension to the present context of the result that an
integral over a rectangle can be reduced to iterated integrals over intervals,
i.e., that we can integrate one variable at a time:

$\iint f(x, y)\,dx\,dy = \int \left[ \int f(x, y)\,dy \right] dx.$

We treated this for continuous functions on boxes in Chapter 4. The
result for Lebesgue-integrable functions on $R^k$, known as Fubini’s theorem,
is considerably more difficult to prove, but the effort is worth it, because
what emerges is an extremely powerful technical tool.

Let us begin by restating the basic result for continuous functions of
compact support.


Theorem 19. If $f$ is a continuous function of compact support on
$R^{k+\ell}$, then
(i) for every $X$ in $R^k$ the function $f(X, \cdot)$ is a continuous function
of compact support on $R^\ell$;
(ii) the function $g$ defined by

$g(X) = \int_{R^\ell} f(X, \cdot)$

is a continuous function of compact support on $R^k$;

(iii) $\int_{R^k} g = \int_{R^{k+\ell}} f.$

Proof. Let $B$ be some closed box outside of which $f = 0$, and apply
Theorem 10 of Chapter 4.
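The box version of this result can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the polynomial $f(x, y) = x y^2$ on the box $[0, 1] \times [0, 2]$ (the setting of the Chapter 4 result cited in the proof, not a function of compact support on the whole plane) and approximates each one-variable integral by a midpoint sum. The helper name `midpoint_sum` is hypothetical.

```python
def midpoint_sum(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def f(x, y):
    return x * y * y

n = 200

# Integrate in y first to get g(x), then integrate g in x.
g = lambda x: midpoint_sum(lambda y: f(x, y), 0.0, 2.0, n)
x_after_y = midpoint_sum(g, 0.0, 1.0, n)

# Reverse the order: integrate in x first, then in y.
k = lambda y: midpoint_sum(lambda x: f(x, y), 0.0, 1.0, n)
y_after_x = midpoint_sum(k, 0.0, 2.0, n)

print(x_after_y, y_after_x)
```

Both orders approximate the exact value $\int_0^1 x\,dx \cdot \int_0^2 y^2\,dy = 4/3$, and they agree with each other up to rounding.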


In presenting Fubini’s theorem, we shall depart from our usual format. 
We shall first state the result and then present the heart of the proof in the 
form of three lemmas. 


Theorem 20 (Fubini’s Theorem). If $f$ is an integrable function on
$R^{k+\ell}$, then
(i) for almost every $X$ in $R^k$ the function $f(X, \cdot)$ is an integrable
function on $R^\ell$;
(ii) the function $g$ defined (almost everywhere) by

$g(X) = \int_{R^\ell} f(X, \cdot)$

is an integrable function on $R^k$;

(iii) $\int_{R^k} g = \int_{R^{k+\ell}} f.$

Notice the difficulty we are in right away. If $f$ is continuous, then for
each $X$ the function $f(X, \cdot)$, the value of which at $Y$ is $f(X, Y)$, is
a perfectly well-defined function on $R^\ell$. But if $f$ is (merely)
integrable there are a great many points at which the value of $f$ is not
defined, and we must prove that for almost every $X$ in $R^k$ the value
$f(X, Y)$ is defined for almost every $Y$ in $R^\ell$. Thus we must know
something about the relationships between sets of measure zero in $R^k$,
$R^\ell$, and $R^{k+\ell}$.


Lemma 1. If $S$ is a set of measure zero in $R^{k+\ell}$, then, for almost
every $X$ in $R^k$, the section

$S_X = \{Y \in R^\ell;\ (X, Y) \in S\}$

is a set of measure zero in $R^\ell$.


Proof. Since $m(S) = 0$, Theorem 2 tells us that there is a sequence of
real-valued functions $\{f_n\}$ of compact support on $R^{k+\ell}$ such that

(i) $f_1 \leq f_2 \leq f_3 \leq \cdots$;
(ii) the sequence of integrals $\{\int f_n\}$ is bounded;
(iii) $\lim f_n(X, Y) = \infty$ for every point $(X, Y) \in S$.

For each $X$ in $R^k$ the sequence $\{f_n(X, \cdot)\}$ is increasing in
$C_c(R^\ell)$. Furthermore the (values of the) sequence diverge at every point
$Y$ which is in the section $S_X$. Therefore (Theorem 2) if we can show that

$\lim \int_{R^\ell} f_n(X, \cdot) < \infty$

for almost every $X$, we will have proved the lemma. To this end, let

$g_n(X) = \int_{R^\ell} f_n(X, \cdot).$

According to Theorem 19,

$\int_{R^k} g_n = \int_{R^{k+\ell}} f_n$

so that

$\lim \int_{R^k} g_n = \lim \int_{R^{k+\ell}} f_n < \infty.$

Since $\{g_n\}$ is an increasing sequence, this implies that

$\lim g_n(X) < \infty$, a.e.

and the lemma is established.


Lemma 2. If $f_1 \le f_2 \le f_3 \le \cdots$ is a monotone-increasing sequence of
functions in $L^1$ for which Fubini’s theorem holds, and if the limit function
\[ f = \lim f_n \]
is integrable, then Fubini’s theorem is valid for f.


Proof. We begin by applying Lemma 1 to the set of points (X, Y)
such that $f_n(X, Y)$ does not converge to f(X, Y). We conclude that there
is a set of measure zero $E \subset R^k$ such that if $X \notin E$, then $f_n(X, Y)$
converges to f(X, Y) for almost every Y in $R^l$. Associated with each $f_n$ is a
set of measure zero $E_n \subset R^k$ via statement (i) of Fubini’s theorem: If
$X \notin E_n$, then $f_n(X, \cdot)$ is an integrable function on $R^l$. The union

\[ E \cup E_1 \cup E_2 \cup \cdots \]

is a set of measure zero. Therefore for almost every X in $R^k$ we have
(a) $\lim f_n(X, Y) = f(X, Y)$, for almost every Y in $R^l$;
(b) for every n, $f_n(X, \cdot)$ is an integrable function on $R^l$.

For such an X, the sequence
\[ g_n(X) = \int_{R^l} f_n(X, \cdot) \]
is (defined and) an increasing sequence. Since Fubini’s theorem holds for
$f_n$, each $g_n$ is in $L^1(R^k)$ and
\[ \int_{R^k} g_n = \int_{R^{k+l}} f_n \]
so that
\[ \lim_n \int_{R^k} g_n = \lim_n \int_{R^{k+l}} f_n. \]
Since $\{g_n\}$ is an increasing sequence, the monotone convergence theorem
implies that for almost every X
(c) $\lim g_n(X) < \infty$.

If $X \in R^k$ satisfies conditions (a), (b), and (c), then the monotone
convergence theorem guarantees that $f(X, \cdot)$ is an integrable function and

\[ \int_{R^l} f(X, \cdot) = \lim_n \int_{R^l} f_n(X, \cdot) = \lim_n g_n(X) \]

which we know to be an integrable function of X. We also know that if

\[ g(X) = \lim_n g_n(X) \]
then
\[ \int_{R^k} g = \lim_n \int_{R^k} g_n = \lim_n \int_{R^{k+l}} f_n = \int_{R^{k+l}} f. \]


Lemma 3. Fubini’s theorem is valid for every bounded measurable
function of compact support.


Proof. Let f be a measurable function, bounded by 1, and suppose f
has compact support. There is a sequence $\{f_n\}$ in $C_c(R^{k+l})$ which converges
pointwise to f, almost everywhere. We can easily arrange that each $f_n$ is
bounded by 1 and that there is a fixed compact set outside of which every
$f_n$ vanishes. We know that Fubini’s theorem is valid for each $f_n$. The proof
that it is valid for f is now virtually identical with the proof of Lemma 2,
except that one uses Lemma 1 and the bounded convergence theorem
rather than Lemma 1 and the monotone convergence theorem.


Proof of Fubini’s Theorem. The family of integrable functions for
which Fubini’s theorem holds is obviously a linear subspace of $L^1(R^{k+l})$.
Therefore it suffices to prove it for non-negative functions. If f is a
non-negative function in $L^1$, let

\[ f_n = (f \wedge n)\, k_{K_n}, \]

where $K_1 \subset K_2 \subset K_3 \subset \cdots$ are compact sets which exhaust $R^{k+l}$.
Lemma 3 tells us that Fubini’s theorem is valid for each $f_n$, and since

\[ f_1 \le f_2 \le f_3 \le \cdots \]
\[ \lim f_n(X) = f(X) \quad \text{everywhere that } f(X) \text{ is defined,} \]

Lemma 2 says the theorem is valid for f.


Corollary. Let S be a measurable subset of $R^{k+l}$. Then, for almost every
X in $R^k$, the section

\[ S_X = \{Y \in R^l;\ (X, Y) \in S\} \]

is a measurable subset of $R^l$, and
\[ m(S) = \int_{R^k} m(S_X)\, dX. \]


Proof. This is just Fubini’s theorem applied to the characteristic 
function of S. 
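
As a small illustration of the section formula (an example chosen here, not taken from the text), let S be the closed unit disk in $R^2$, so that k = l = 1. For $|x| \le 1$ the section $S_x$ is an interval, and for $|x| > 1$ it is empty:

```latex
\[
  S_x = \bigl[-\sqrt{1 - x^2},\ \sqrt{1 - x^2}\,\bigr], \qquad
  m(S_x) = 2\sqrt{1 - x^2} \quad (|x| \le 1),
\]
\[
  m(S) = \int_{-1}^{1} 2\sqrt{1 - x^2}\, dx = \pi .
\]
```

The formula recovers the familiar area of the disk by integrating the lengths of its vertical slices.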


Corollary. If S is a measurable subset of $R^k$ and T is a measurable
subset of $R^l$, then the Cartesian product S × T is a measurable subset of
$R^{k+l}$, and m(S × T) = m(S)m(T).


Proof. Once one has shown that S × T is measurable, the formula
m(S × T) = m(S)m(T) is immediate from the previous corollary. We
have left the proof that S × T is measurable to the exercises.


In practice we will often use the familiar notation

\[ \int f(X)\, dX = \int f \]

to denote the Lebesgue integral of f over $R^k$. In fact, we did this in the first
corollary above. This notation is especially helpful in expressing the Fubini
result:

(7.32) \[ \int \Bigl[ \int f(X, Y)\, dY \Bigr] dX = \int f(X, Y)\, dX\, dY. \]

In the last integral we wrote dX dY in place of d(X, Y). This notation was
avoided in the proof of Fubini’s theorem, so that we would be forced to
think carefully about what type of object each symbol represented.
Two comments are in order. First, obviously the order in (7.32) is not
important, that is, we may just as well integrate first with respect to X
and then with respect to Y. Second, it is extremely important to bear in
mind that the validity of Fubini’s theorem depends on the a priori
knowledge that f is integrable over $R^{k+l}$. If f is merely measurable, both of the
iterated integrals may exist yet be different. Fortunately this cannot happen
if f is a non-negative measurable function; and the result for non-negative
functions is the most useful tool for testing the integrability of functions on
$R^{k+l}$ which are made up out of functions on $R^k$ and functions on $R^l$.
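
A standard illustration of the second comment (an example supplied here, not taken from the text) is the function $f(x, y) = (x^2 - y^2)/(x^2 + y^2)^2$ on the square $(0, 1] \times (0, 1]$. It is measurable but not integrable, and its two iterated integrals exist and differ. The inner integral can be computed in closed form because $y/(x^2 + y^2)$ is an antiderivative in y:

```latex
\[
  \int_0^1 \frac{x^2 - y^2}{(x^2 + y^2)^2}\, dy
    = \left[ \frac{y}{x^2 + y^2} \right]_{y=0}^{y=1}
    = \frac{1}{1 + x^2},
  \qquad
  \int_0^1 \Bigl[ \int_0^1 f(x, y)\, dy \Bigr] dx
    = \int_0^1 \frac{dx}{1 + x^2} = \frac{\pi}{4}.
\]
\[
  \text{Since } f(y, x) = -f(x, y): \qquad
  \int_0^1 \Bigl[ \int_0^1 f(x, y)\, dx \Bigr] dy = -\frac{\pi}{4}.
\]
```

The disagreement is possible only because $|f|$ has infinite integral over the square; for a non-negative function the two orders must agree.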
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Theorem 21 (Fubini). If f is a non-negative measurable function on 
$R^{k+l}$, then

(i) for almost every X in $R^k$, the function $f(X, \cdot)$ is measurable;
(ii) the function

\[ g(X) = \int_{R^l} f(X, \cdot) \]

is a measurable function on $R^k$ if we allow $+\infty$ as a value;
(iii)
\[ \int_{R^k} g = \int_{R^{k+l}} f. \]

Proof. What do we mean in part (ii) where we say “if $+\infty$ is allowed
as a value”? At the simplest level we mean, of course, that since f is not
assumed to be integrable we may have

\[ \int_{R^l} f(X, \cdot) = \infty \]

for quite a few values of X. But we mean something more. We mean that
the function g, which is defined almost everywhere on $R^k$ and maps into
the extended non-negative real number system, is measurable in the sense
that there is a sequence of non-negative continuous functions of compact
support which converges to g pointwise almost everywhere. For such a
function g the truncated functions $g \wedge n$ are measurable in the usual sense
and converge monotonely upward to g. Since the monotone convergence
theorem works just as well for infinite limits as finite ones:

\[ f_1 \le f_2 \le f_3 \le \cdots, \quad f = \lim f_n, \quad
   \lim \int f_n = \infty \ \Longrightarrow\ \int f = \infty, \]

we have no trouble in defining the integrals of non-negative functions
which are measurable in this extended sense. The proof of Fubini’s theorem
for these extended non-negative measurable functions is just (the proof of)
Lemma 2 plus the fact that we know Fubini’s theorem for the truncated
functions

\[ (f \wedge n)\, k_{K_n}. \]


Corollary. Let f be a measurable function on $R^{k+l}$. If either of the
iterated integrals

\[ \int \Bigl\{ \int |f(X, Y)|\, dY \Bigr\} dX, \qquad
   \int \Bigl\{ \int |f(X, Y)|\, dX \Bigr\} dY \]

is finite, then (they both are and) f is an integrable function.

Proof. Since Fubini’s theorem is valid for |f|, the finiteness of either
integral will imply the integrability of f because they are both equal to

\[ \int\!\!\int |f(X, Y)|\, dX\, dY. \]


EXAMPLE 5. Let f and g be integrable functions on $R^k$. Associated with
the pair (f, g) is an important “product” known as the convolution of f
and g:

\[ (f * g)(X) = \int f(X - Y)\, g(Y)\, dY. \]

It is far from clear that the integral on the right makes sense. After all,
the product of two integrable functions (although measurable) need not
be integrable. But we can use the second Fubini theorem to show that,
given f and g, it is true that for almost every X the product f(X − Y)g(Y)
is an integrable function of Y.

First note that, for any given f, the function

\[ F(X, Y) = f(X - Y) \]

is a measurable function of X and Y, i.e., a measurable function on
$R^k \times R^k$. This can be verified as follows. If $\{f_n\}$ is a sequence of continuous
functions which converges pointwise almost everywhere to f, and if
$F_n(X, Y) = f_n(X - Y)$, we want to show that $F_n$ converges pointwise to F,
almost everywhere on $R^k \times R^k$. If E is the set of points X in $R^k$ such that
$f_n(X)$ does not converge to f(X), then $F_n(X, Y)$ fails to converge to F(X, Y)
on the set

(7.33) \[ \{(X, Y);\ (X - Y) \in E\}. \]

We need to know that, if E is any set of measure zero in $R^k$, then (7.33) is
a set of measure zero in $R^{2k} = R^k \times R^k$. We have left the proof of this to
the exercises. Once it is verified, we have shown that F is a measurable
function on $R^{2k}$, i.e., that f(X − Y) is a measurable function of X and Y.


Now, given f and g in $L^1(R^k)$ we know that
\[ h(X, Y) = f(X - Y)\, g(Y) \]
is a measurable function on $R^{2k}$. We test its integrability using Theorem
21, by integrating |h(X, Y)|, first with respect to X, then with respect to Y:

\[ \int |h(X, Y)|\, dX = \int |f(X - Y)\, g(Y)|\, dX
   = |g(Y)| \int |f(X - Y)|\, dX. \]

Evidently,
\[ \int |f(X - Y)|\, dX = \int |f(X)|\, dX; \]
i.e., translation by (−Y) does not affect the $L^1$-norm. (Why? Because this
is obviously so for continuous functions of compact support.) Thus

\[ \int\!\!\int |h(X, Y)|\, dX\, dY = \|f\|_1 \int |g(Y)|\, dY = \|f\|_1\, \|g\|_1. \]

Thus, $h \in L^1(R^{2k})$ and $\|h\|_1 = \|f\|_1\, \|g\|_1$.

Now we may apply the Fubini theorem (Theorem 20) to conclude
that for almost every X, the function f(X − Y)g(Y) is integrable with
respect to Y, that the function

\[ (f * g)(X) = \int f(X - Y)\, g(Y)\, dY \]

is an integrable function on $R^k$, and that

\[ \int (f * g)(X)\, dX = \int\!\!\int f(X - Y)\, g(Y)\, dX\, dY
   = \Bigl(\int f\Bigr)\Bigl(\int g\Bigr). \]

We also know that

\[ \|f * g\|_1 \le \|f\|_1\, \|g\|_1. \]

The operation of convolution on $L^1$:

\[ (f, g) \mapsto f * g \]

plays a central role in the theory of Fourier series and Fourier integrals.
At this stage, the point we wish to emphasize is how the use of the second
Fubini theorem, integrating first with respect to X and then with respect
to Y, makes it clear that f(X − Y)g(Y) is integrable on $R^{2k}$. Thus, whereas
it was not clear a priori that there need exist a single X such that
f(X − Y)g(Y) is integrable with respect to Y, it follows that this is so
for almost every X.
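
The two convolution facts from Example 5, that the integral of f * g is the product of the integrals and that the $L^1$-norm of f * g is at most the product of the norms, can be checked numerically on the line (k = 1). The sketch below is only an illustration under assumptions made here: the sample functions, the grid spacing `h`, and the use of Riemann sums in place of Lebesgue integrals are not from the text.

```python
import numpy as np

h = 0.01                                       # grid spacing (assumed)
x = np.arange(-5.0, 5.0, h)                    # sample points on the line
f = np.exp(-x**2)                              # a sample integrable function
g = np.where(np.abs(x) <= 1.0, x + 0.5, 0.0)   # integrable, changes sign

# Riemann-sum version of (f * g)(X) = integral of f(X - Y) g(Y) dY
conv = np.convolve(f, g) * h

def integral(values):
    """Riemann sum approximating the integral of a sampled function."""
    return float(np.sum(values) * h)

int_conv = integral(conv)                      # integral of f * g
norm_conv = integral(np.abs(conv))             # L^1 norm of f * g
print(int_conv, integral(f) * integral(g))
print(norm_conv, integral(np.abs(f)) * integral(np.abs(g)))
```

For these discrete sums the first identity holds exactly (up to rounding), because the sum of a full discrete convolution is the product of the sums; the second line shows the norm inequality, which is strict here because g changes sign.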


Exercises 


1. Let S be a set of measure zero in $R^k$ and let T be a bounded subset of $R^l$.
Prove that the Cartesian product S × T is a set of measure zero in $R^{k+l}$.


2. Use the result of Exercise 1 to prove that, if S and T are subsets of $R^k$ and $R^l$,
respectively, and if either S or T is a set of measure zero, then S × T is a set of
measure zero.


3. Let T be a fixed subset of $R^l$. Let $\mathcal{S}_T$ be the family of all subsets S of $R^k$
such that S × T is a measurable subset of $R^{k+l}$. Prove that

(a) $\mathcal{S}_T$ contains every set of measure zero in $R^k$;
(b) if $S \in \mathcal{S}_T$, then the complement $(R^k - S)$ is in $\mathcal{S}_T$;
(c) if $\{S_n\}$ is a sequence of sets in $\mathcal{S}_T$, the union $\bigcup S_n$ is in $\mathcal{S}_T$.

Note that (b) and (c) imply that $\mathcal{S}_T$ is also closed under countable intersections.


4. Show that, if S is any measurable set in $R^k$, there is a set $\tilde{S}$ with these
properties:
(a) $S \subset \tilde{S}$;
(b) $\tilde{S} = \bigcap U_n$, where each $U_n$ is open (and therefore the union of a sequence
of boxes);
(c) $m(\tilde{S} - S) = 0$.


5. Use the results of Exercises 3 and 4 to prove that, if B is a box in $R^l$, then
S × B is a measurable subset of $R^{k+l}$, for every measurable set S in $R^k$.


6. Use the results of Exercises 3, 4, and 5 to prove that, if S is a measurable
subset of $R^k$ and T is a measurable subset of $R^l$, then S × T is a measurable
subset of $R^{k+l}$. (Its measure is, of course, m(S × T) = m(S)m(T).)


7. Prove that, if f is a measurable function on $R^k$ and g is a measurable function
on $R^k$, then h(X, Y) = f(X)g(Y) is a measurable function on $R^{2k}$. (Hint: Use
Exercise 2.)


8. From Exercise 7, do you see a simple proof that the Cartesian product of two
measurable sets is measurable?


9. If f is a real-valued function on $R^k$, the graph of f is a set of measure zero in
$R^{k+1}$.


10. If f is a non-negative measurable function on $R^k$, the set
\[ S = \{(X, y) \in R^{k+1};\ 0 \le y \le f(X)\} \]
is a measurable subset of $R^{k+1}$ and
\[ m(S) = \int f. \]
11. Let 
f(x, 9) ==>; 


Is f integrable on the square [0, 1] Χ [0, 1]? What about 
| Ἔ eer 
7, ») =a γ᾽ 


12. Is 


1 
0 ΈΞΞΣΣ τσ “3 


integrable on the cube [0, 1] x [0, 1] x [0, 1]? 


13. Let $p(x_1, \ldots, x_n)$ be a polynomial in n variables, i.e., a polynomial function
on $R^n$. Let N be the null set (zero set) of p:

\[ N = \{x \in R^n;\ p(x_1, \ldots, x_n) = 0\}. \]

Show that N is a set of measure zero, unless p = 0. (Hint: Use Fubini and
induction on n.)


14. Prove that almost every k × k matrix is invertible.


15. Let E be a set of measure zero in $R^k$. Prove that the set
\[ S = \{(X, Y) \in R^{2k};\ (X - Y) \in E\} \]
is a set of measure zero in $R^{2k}$. (Hint: It suffices to prove that each bounded part
of S has measure zero. Handle such a bounded part just as in Exercise 1, except
that the sets X − Y = constant are used in place of the sets X = constant.)


16. Prove that the operation of convolution on $L^1(R^k)$ has these properties.

(i) $(f * g) * h = f * (g * h)$;
(ii) $f * g = g * f$;
(iii) $f * (cg + h) = c(f * g) + f * h$;
(iv) If f is of class $C^n$, then $f * g$ is of class $C^n$ for all $g \in L^1(R^k)$.


17. If $f \in L^1(R^k)$ and g is a bounded measurable function on $R^k$, the
convolution

\[ (f * g)(X) = \int f(X - Y)\, g(Y)\, dY \]

is a uniformly continuous function on $R^k$.


18. Use the result of Exercise 17 to prove the following. If E is a measurable set
of positive measure in $R^k$, the algebraic difference set

\[ E - E = \{X - Y;\ X \in E,\ Y \in E\} \]

contains a non-empty open set.


*19. True or false? Almost every real n × n matrix has a real characteristic
value.


7.9. Orthogonal Expansions 


Let E be a fixed measurable subset of $R^k$. The collection of all
measurable functions f on E such that $|f|^2$ is integrable over E forms a
complete inner product space (Hilbert space) in a natural way. This makes it
possible to expand each function in this space, which we denote $L^2(E)$,
in a series of pairwise orthogonal functions. This can be done in a
variety of ways, depending on the orthogonal sequence of functions which
is of particular interest in a given mathematical situation. This section
will provide only a glimpse of this important set of ideas. The only
orthogonal system we shall discuss in any detail is the system of functions
$e^{inx}$, $n = 0, \pm 1, \pm 2, \ldots$ on the subset $E = [-\pi, \pi]$ of the real line.
This will round out the general aspects of our discussion of Fourier series.


Notation. Let p be a positive real number, and let E be a measurable
subset of $R^k$. We denote by $L^p(E)$ the collection of all measurable functions
f on the set E for which

\[ \int_E |f|^p < \infty. \]

Note that if g is a non-negative measurable function on E and $0 < p < \infty$,
then $g^p$ is a measurable function on E. (See Exercise 10, Section 7.7.)
Thus the definition of $L^p(E)$ makes sense. In case p = 1, it agrees with our
previous definition of $L^1(E)$.


Convention. As with $L^1(E)$, we agree to call two functions in $L^p(E)$
equal whenever they differ by a null function.


Now it is a fact that for every p, $1 \le p < \infty$, $L^p(E)$ is a Banach space
relative to the $L^p$-norm:

(7.34) \[ \|f\|_p = \Bigl( \int_E |f|^p \Bigr)^{1/p}. \]

We carry out the proof only for the case we are interested in.


Lemma. If f and g are functions in $L^2(E)$, then the product fg is in
$L^1(E)$. Furthermore,

(7.35) \[ \langle f, g \rangle = \int_E f g^* \]

is an inner product on $L^2(E)$.

Proof. The properties required of an inner product are:
(i) $\langle cf_1 + f_2, g \rangle = c\langle f_1, g \rangle + \langle f_2, g \rangle$;
(ii) $\langle g, f \rangle = \langle f, g \rangle^*$;
(iii) $\langle f, f \rangle \ge 0$; if $\langle f, f \rangle = 0$, then f = 0.
We saw in Chapter 6 that any inner product satisfies the Cauchy-Schwarz
inequality:

\[ |\langle f, g \rangle|^2 \le \langle f, f \rangle \langle g, g \rangle. \]

The (standard) proof of this goes as follows. Note that
\[ 0 \le \|f + cg\|^2 = \langle f + cg, f + cg \rangle
   = \langle f, f \rangle + c\langle g, f \rangle + c^*\langle f, g \rangle + |c|^2 \langle g, g \rangle \]
for every complex number c. Choose c so that the sum of the last two
terms is 0:

\[ \langle f, g \rangle + c\langle g, g \rangle = 0. \]

We can do that unless g = 0. (If g = 0, the inequality which we want to
prove is trivial.) Put that value of c into the inequality
\[ 0 \le \|f + cg\|^2 \]
and out comes the Cauchy-Schwarz inequality.
Thus we know that

(7.36) \[ \Bigl| \int_E f g^* \Bigr|^2 \le \int_E |f|^2 \int_E |g|^2 \]

on any subspace where (7.35) is known to make sense. For instance we
know (7.36) on the subspace of $L^2(E)$ consisting of the bounded
measurable functions of compact support. From the monotone convergence
theorem and the fact that

\[ \Bigl( \int_E |f|\,|g| \Bigr)^2 \le \int_E |f|^2 \int_E |g|^2 \]

on this subspace it is immediate that $fg \in L^1(E)$ for all $f, g \in L^2(E)$.
The last minor point which requires comment deals with condition
(iii) for an inner product. In the case at hand, this says

\[ \langle f, f \rangle = \int_E |f|^2 = \|f\|_2^2 \ge 0 \]

and that if $\langle f, f \rangle = 0$, then f = 0. Note that the last statement is legitimate
because of our convention that null functions are equal to 0.
Theorem 22 (Riesz-Fischer). $L^2(E)$ is a Hilbert space, i.e., a complete
inner product space.

Proof. The triangle inequality
\[ \|f + g\|_2 \le \|f\|_2 + \|g\|_2 \]
is immediate from the Cauchy-Schwarz inequality and the relation

\[ \|f\|_2^2 = \langle f, f \rangle. \]

Thus what we are proving is that $(L^2, \|\cdot\|_2)$ is a Banach space. Consider
a fast Cauchy sequence in $L^2(E)$:

\[ \sum_n \|f_n - f_{n-1}\|_2 = M < \infty. \]

We want to show that $\{f_n\}$ converges. Let $g_n = f_n - f_{n-1}$ (with $f_0 = 0$) so
that

\[ \sum_n \|g_n\|_2 = M < \infty \]

and we want to prove that the series

\[ \sum_n g_n \]

converges in $L^2(E)$. Let
\[ h_n = \sum_{j=1}^{n} |g_j| \]
so that $h_n \in L^2(E)$, $0 \le h_1 \le h_2 \le h_3 \le \cdots$ and (by the triangle
inequality)
\[ \|h_n\|_2 \le \sum_{j=1}^{n} \|g_j\|_2 \le M. \]
Thus
\[ 0 \le h_1^2 \le h_2^2 \le h_3^2 \le \cdots, \qquad \int h_n^2 \le M^2, \]
from which it follows (monotone convergence) that
\[ h(X) = \lim h_n(X) < \infty, \ \text{a.e.}, \qquad h \in L^2(E), \qquad
   \int h^2 = \lim \int h_n^2. \]

At any point X where $h(X) < \infty$, the series $\sum g_n(X)$ is absolutely
convergent. If we let

\[ f = \sum_n g_n, \quad \text{a.e.} \]

then since
\[ f_n = \sum_{j=1}^{n} g_j, \qquad |f_n| \le h_n \le h, \]
we see that $f \in L^2(E)$ and $|f| \le h$. Since f and $f_n$ are dominated by h, we
have
\[ |f - f_n|^2 \le 4h^2, \qquad \lim f_n(X) = f(X), \ \text{a.e.} \]
so, by the dominated convergence theorem,
\[ \lim \int |f - f_n|^2 = 0. \]


The completeness of L? is very useful in discussing various types of 
series expansions. In a variety of problems important in mathematics and 
physics, one encounters expansions into series of orthogonal functions. 


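Definition. Let $\{f_n\}$ be a sequence in $L^2(E)$. We call the sequence
orthogonal if
\[ \langle f_n, f_k \rangle = 0, \quad n \ne k \]
and we call it orthonormal if

\[ \langle f_n, f_k \rangle = \delta_{nk} =
   \begin{cases} 1, & n = k \\ 0, & n \ne k. \end{cases} \]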


If we have an orthogonal sequence $\{f_n\}$, and if each $f_n \ne 0$, we can
transform it into an orthonormal sequence by dividing each $f_n$ by its norm.
Thus, the two concepts are only a whisker apart. We might put it this way:
In working with orthogonal sequences, we may as well normalize so that
each function has norm 1.

EXAMPLE 6. The most famous orthonormal sequence is that given by
the trigonometric functions on the interval $E = [-\pi, \pi]$:
\[ f_n(x) = e^{inx}, \quad n = 0, \pm 1, \pm 2, \ldots \]

It is customary here to normalize so that E has measure 1. This means
that we absorb a factor of $(2\pi)^{-1}$ into the inner product:

\[ \langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, g(x)^*\, dx. \]


EXAMPLE 7. The Legendre polynomials $P_n$ have these properties:

(i) $P_n$ is a polynomial of degree n (n = 0, 1, 2, ...);
(ii) $\{P_n\}$ is an orthogonal sequence on the interval [−1, 1].

Those conditions determine $P_n$ to within a constant factor and the
polynomials are:

\[ P_n(x) = (2^n n!)^{-1} \frac{d^n (x^2 - 1)^n}{dx^n}. \]

Lemma. If $\{g_n\}$ is a linearly independent sequence in $L^2(E)$, there is a
unique orthonormal sequence $\{f_n\}$ such that, for each n, $f_n$ is a linear
combination of $g_1, \ldots, g_n$.

Proof. The sequence $\{f_n\}$ can be constructed recursively as follows.
Let

\[ f_1 = \frac{g_1}{\|g_1\|}. \]

After $f_1, \ldots, f_{n-1}$ have been found, let

\[ f_n = \frac{h_n}{\|h_n\|} \]

where

\[ h_n = g_n - \sum_{k=1}^{n-1} \langle g_n, f_k \rangle f_k. \]

The process in this lemma enables one to construct a great many 
orthonormal sequences. For example, on any bounded set E one may 
apply the process to the sequence of polynomial functions 1, x, x?,.... 
The Legendre polynomials are the result of doing this on E = [—1, 1]. 
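
The recursion in the lemma can be sketched in a few lines. The code below is an illustration written for this purpose, not part of the text: it applies the process to 1, x, x², x³ on [−1, 1], using the unnormalized inner product (the integral of pq over [−1, 1]) for real polynomials, and the helper names `inner` and `gram_schmidt` are hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial

def inner(p, q):
    """<p, q> = integral of p*q over [-1, 1], for real polynomials."""
    antiderivative = (p * q).integ()
    return float(antiderivative(1.0) - antiderivative(-1.0))

def gram_schmidt(gs):
    """Orthonormalize a linearly independent list, following the lemma."""
    fs = []
    for g in gs:
        h = g
        for f in fs:
            h = h - f * inner(g, f)          # h_n = g_n - sum <g_n, f_k> f_k
        fs.append(h / np.sqrt(inner(h, h)))  # f_n = h_n / ||h_n||
    return fs

monomials = [Polynomial([0.0] * n + [1.0]) for n in range(4)]  # 1, x, x^2, x^3
fs = gram_schmidt(monomials)

gram = np.array([[inner(p, q) for q in fs] for p in fs])
print(np.round(gram, 10))                    # approximately the identity matrix
p2 = fs[2] / fs[2](1.0)                      # rescale so that the value at 1 is 1
print(p2.coef)                               # compare with (3x^2 - 1)/2
```

Rescaling the third function so that its value at 1 equals 1 gives the coefficients of (3x² − 1)/2, which is what the formula of Example 7 yields for n = 2; the orthonormal functions differ from the Legendre polynomials only by constant factors.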


Theorem 23. Let $\{f_n\}$ be an orthonormal sequence in $L^2(E)$. The
following are equivalent (all true or all false).

(i) If $h \in L^2$ and $\langle h, f_n \rangle = 0$ for all n, then h = 0.
(ii) For every f in $L^2$:
\[ f = \sum_n \langle f, f_n \rangle f_n \quad \text{(series converging in } L^2\text{).} \]
(iii) For every f in $L^2$:
\[ \|f\|_2^2 = \sum_n |\langle f, f_n \rangle|^2. \]
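Proof. Let $\{f_n\}$ be any orthonormal sequence. Let $f \in L^2(E)$. Let
\[ c_n = \langle f, f_n \rangle, \qquad g_n = \sum_{k=1}^{n} c_k f_k. \]

Notice that

\[ \langle g_n, f \rangle = \Bigl\langle \sum_{k=1}^{n} c_k f_k,\ f \Bigr\rangle
   = \sum_{k=1}^{n} c_k \langle f_k, f \rangle. \]
Since

\[ \langle f_k, f \rangle = \langle f, f_k \rangle^* = c_k^* \]

we have
\[ \langle g_n, f \rangle = \sum_{k=1}^{n} |c_k|^2. \]

From the orthonormality relations $\langle f_n, f_k \rangle = \delta_{nk}$, it is easy to compute
that

\[ \|g_n\|_2^2 = \sum_{k=1}^{n} |c_k|^2. \]
Therefore,

\[ \|f - g_n\|_2^2 = \langle f - g_n, f - g_n \rangle
   = \langle f, f \rangle - \langle f, g_n \rangle - \langle g_n, f \rangle + \langle g_n, g_n \rangle
   = \|f\|_2^2 - \sum_{k=1}^{n} |c_k|^2. \]
Thus we have Bessel’s inequality:
\[ \sum_{k=1}^{n} |c_k|^2 \le \|f\|_2^2, \qquad c_k = \langle f, f_k \rangle. \]

Since that holds for each n,
\[ \sum_{k=1}^{\infty} |c_k|^2 \le \|f\|_2^2. \]
Now, we can verify that $\{g_n\}$ is a Cauchy sequence in $L^2(E)$:

\[ \|g_n - g_N\|_2^2 = \sum_{k=N+1}^{n} |c_k|^2, \quad n > N, \qquad
   \lim_N \sum_{k=N+1}^{\infty} |c_k|^2 = 0. \]

Since $L^2$ is complete, there exists $g \in L^2$ with

\[ \lim_n \|g - g_n\|_2 = 0, \]
i.e., the series

\[ \sum_n c_n f_n \]

converges in $L^2$. Let

\[ h = f - g. \]

It is a simple matter to verify
\[ \|h\|_2^2 = \|f - g\|_2^2 = \lim_n \|f - g_n\|_2^2
   = \|f\|_2^2 - \sum_{k=1}^{\infty} |c_k|^2 = \|f\|_2^2 - \|g\|_2^2. \]

Similarly,
\[ \langle h, f_n \rangle = 0, \quad \text{all } n. \]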


All of this is valid for any orthonormal sequence. Now, look at the
properties (i), (ii), (iii) in the statement of the theorem. Property (i) says
that, for every $f \in L^2$, the corresponding h is 0. Property (ii) says that, for
every $f \in L^2$, the corresponding g is equal to f. Property (iii) says that, for
every $f \in L^2$, $\|f\|_2 = \|g\|_2$. It should now be clear that those are
equivalent properties of the sequence $\{f_n\}$.


What does Theorem 23 tell us? Suppose we call the orthonormal
sequence $\{f_n\}$ complete provided 0 is the only element of $L^2$ which is
orthogonal to every $f_n$. We then know this. If $\{f_n\}$ is a complete orthonormal
sequence in $L^2$, every f in $L^2$ can be expanded in a series

\[ f = \sum_n c_n f_n \]

which converges to f in $L^2$-norm. The coefficients in the expansion are:
\[ c_n = \langle f, f_n \rangle = \int_E f f_n^*. \]

The functions $f_n$ serve as a basis of mutually perpendicular unit vectors.
The coefficients $c_n$ are the coordinates, much as in Euclidean space. Norms
can be computed using these coordinates:

\[ \|f\|_2^2 = \sum_n |c_n|^2. \]


In fact, if 


then 



The Lebesgue Integral Chap. 7 


The reader may have seen a proof that the trigonometric sequence
\[ f_n(x) = e^{inx}, \quad n = 0, \pm 1, \pm 2, \ldots \]
is complete in $L^2(-\pi, \pi)$. Suppose you know Weierstrass’ approximation
theorem, in this form (third corollary, Theorem 24, Chapter 5): If g is a
continuous function on $[-\pi, \pi]$ and if $g(-\pi) = g(\pi)$, then g can be
uniformly approximated by trigonometric polynomials

\[ P(x) = \sum_{k=-n}^{n} a_k e^{ikx}. \]

Suppose one knows that. Suppose $h \in L^2$ and all of the Fourier coefficients
of h are 0:

\[ \int_E h f_n^* = 0, \quad n = 0, \pm 1, \pm 2, \ldots. \]
By the Weierstrass theorem,
\[ \int_E h g = 0 \]
for every continuous g with $g(-\pi) = g(\pi)$. Use pointwise bounded
approximation and the dominated convergence theorem to see that

\[ \int_E h g = 0 \]

for every bounded measurable function g. Now choose a sequence of
bounded measurable $g_n$’s such that

\[ |g_n| \le |h|, \qquad g_n \to h^*, \ \text{a.e.} \]
We obtain
\[ \int_E h h^* = 0; \]
so h = 0.


Theorem 24. Let $f \in L^2(-\pi, \pi)$ and define the Fourier coefficients of f
by
\[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-inx}\, dx. \]

Then the partial sums of the Fourier series

\[ \sum_{n=-\infty}^{\infty} c_n e^{inx} \]

converge to f in $L^2$ norm, and

\[ \sum_n |c_n|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\, dx. \]

If
\[ \{c_n\}, \quad n = 0, \pm 1, \pm 2, \ldots \]
is any sequence of complex numbers which is square-summable:
\[ \sum_n |c_n|^2 < \infty \]
there is a (unique) function f in $L^2$ such that $\{c_n\}$ is the sequence of Fourier
coefficients of f.
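
The norm identity in Theorem 24 and Bessel’s inequality for the partial sums can be observed numerically. The sketch below is an illustration under assumptions made here: the sample function f(x) = x, the number of quadrature cells `M`, and the midpoint rule used in place of the Lebesgue integral are not from the text.

```python
import numpy as np

M = 20000                                   # number of quadrature cells (assumed)
dx = 2 * np.pi / M
x = -np.pi + (np.arange(M) + 0.5) * dx      # midpoints of the cells
f = x                                       # sample function f(x) = x

def coefficient(n):
    """Midpoint-rule value of c_n = (1/2pi) * integral of f(x) e^{-inx} dx."""
    return np.sum(f * np.exp(-1j * n * x)) * dx / (2 * np.pi)

# (1/2pi) * integral of |f|^2, which is pi^2 / 3 for f(x) = x
norm_sq = float(np.sum(np.abs(f) ** 2) * dx / (2 * np.pi))

def partial(N):
    """Sum of |c_n|^2 over |n| <= N."""
    return float(sum(abs(coefficient(n)) ** 2 for n in range(-N, N + 1)))

for N in (1, 10, 100):
    print(N, partial(N), norm_sq)
```

The partial sums increase with N and stay below the squared norm, as Bessel’s inequality requires, and the gap shrinks as N grows. For this f the coefficients are $c_n = i(-1)^n/n$ for $n \ne 0$, so the identity amounts to the classical sum $2\sum 1/n^2 = \pi^2/3$.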


We can also expand each function in $L^2(-1, 1)$ in a series using the
Legendre polynomials of Example 7. The completeness property of those 
polynomials can be proved just as we did it for the trigonometric functions, 
by using the Weierstrass theorem on approximation by ordinary poly- 
nomials. There are many other complete orthonormal sequences of 
polynomials which are important: Laguerre polynomials, Hermite poly- 
nomials, etc. The associated series expansions can be used to solve a variety 
of problems. Let’s describe one such problem—to solve an integral 
equation. 


EXAMPLE 8. Let g be a continuous function on the interval [a, b]. Let
K be a continuous function on the square [a, b] × [a, b]. We want to solve
the equation

\[ g(x) = f(x) + \int_a^b f(t)\, K(x, t)\, dt \]

for the function f. There is associated with the kernel K a complete
orthonormal sequence in $L^2(a, b)$. The kernel K defines a mapping T from
$L^2(a, b)$ into $L^2(a, b)$:

\[ (Tf)(x) = \int_a^b f(t)\, K(x, t)\, dt, \quad f \in L^2. \]

Actually, Tf is a continuous function so that

\[ T: L^2(a, b) \to C([a, b]). \]
Note that T is a linear transformation:
\[ T(cf_1 + f_2) = cTf_1 + Tf_2. \]
Suppose K is real-valued and symmetric:
\[ K(t, x) = K(x, t). \]

It can be proved that there is a complete orthonormal sequence $\{f_n\}$ in
$L^2(a, b)$ such that each $f_n$ is a characteristic vector of the transformation T:
\[ Tf_n = \lambda_n f_n \quad (\lambda_n \in R). \]

The various $\lambda_n$’s need not all be distinct.


Suppose one has such a sequence $\{f_n\}$. One description of how to solve
the integral equation

\[ g(x) = f(x) + (Tf)(x) \]
is this: Expand

\[ g = \sum_n a_n f_n \]
and the solution for f is
\[ f = \sum_n c_n f_n \]
where
\[ a_n = c_n + \lambda_n c_n, \]
i.e.,
\[ c_n = a_n (1 + \lambda_n)^{-1}. \]
That won’t work if $\lambda_n = -1$ for some n. In the most important case,
which is K > 0, one cannot have the difficulty because it can be shown that
\[ \lambda_n \ge 0, \qquad \lim \lambda_n = 0. \]

The method described then solves the equation for every $g \in L^2(a, b)$.
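
A finite-dimensional analogue of the method in Example 8 can be sketched as follows. Everything specific here is an assumption made for illustration: the interval, the Gaussian kernel, the right-hand side, and the midpoint-rule discretization that replaces the operator T by a symmetric matrix. The characteristic vectors come from `numpy.linalg.eigh`.

```python
import numpy as np

a, b, m = 0.0, 1.0, 200                  # interval and grid size (assumed)
w = (b - a) / m                          # midpoint-rule weight
t = a + (np.arange(m) + 0.5) * w         # quadrature nodes

K = np.exp(-(t[:, None] - t[None, :]) ** 2)   # symmetric sample kernel K(x, t)
T = K * w                                # matrix standing in for the operator T
g = np.cos(3 * t)                        # sample right-hand side

lam, F = np.linalg.eigh(T)               # columns of F: orthonormal eigenvectors
a_n = F.T @ g                            # coefficients of g in the eigenbasis
c_n = a_n / (1.0 + lam)                  # c_n = a_n (1 + lambda_n)^(-1)
f = F @ c_n                              # f = sum of c_n f_n

residual = float(np.max(np.abs(f + T @ f - g)))
print(residual)                          # size of g - (f + Tf) on the grid
```

The division is harmless here because the characteristic values of this sample kernel matrix are (numerically) non-negative, so no $1 + \lambda_n$ vanishes. The result agrees with solving the linear system (I + T)f = g directly.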


Exercises 


1. Carry out the proof that the Legendre polynomials (Example 7) form a
complete orthonormal sequence in $L^2(-1, 1)$.


2. If f and g are in $L^2(-\pi, \pi)$, extend them periodically to the real line with
period $2\pi$. Show that the convolution

\[ (f * g)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, g(x - t)\, dt \]

is in $L^1(-\pi, \pi)$ and that its Fourier coefficients $c_n$ are given by
\[ c_n = a_n b_n \]
where $a_n$ and $b_n$ are the nth Fourier coefficients of f and g, respectively.


3. Assume from Chapter 6 the fact
\[ \|X\|_p = (|x_1|^p + \cdots + |x_n|^p)^{1/p}, \quad 1 \le p < \infty \]
is a norm (satisfies the triangle inequality) on $R^n$ (or $C^n$). Use this to prove that if
f and g are simple functions,

\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]

Then show that this inequality holds for all non-negative measurable functions
and, hence, that $\|\cdot\|_p$ is a norm on $L^p$.


4. Prove that $L^p(E)$ is complete, $1 \le p < \infty$. (Hint: Show that (basically) the
same proof we gave for $L^2$ works for $L^p$.)


5. If $f \in L^2(-\pi, \pi)$, show that

\[ f(re^{i\theta}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, P_r(\theta - t)\, dt \]

defines a harmonic function on the open unit disk $\{z \in C;\ |z| < 1\}$ and that this
function is complex-analytic if and only if the Fourier coefficients

\[ c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-inx}\, dx \]

are 0 for n < 0. ($P_r$ is the Poisson kernel from Chapter 5.)


6. If f is a bounded measurable function, define

\[ \|f\|_\infty = \inf_g\ \sup |g| \]

where g ranges over all functions which agree with f almost everywhere. Prove
that

\[ \|f\|_\infty = \lim_{p \to \infty} \|f\|_p. \]
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We are going to discuss the differentiation of maps from one Euclidean 
space into another. We begin by reviewing the basic facts which we shall 
need concerning linear transformations and partial derivatives. 


8.1. Linear Transformations 


The simplest class of (differentiable) maps from one Euclidean space 
into another consists of the transformations T which are linear: 


T(cX + Y) = cT(X)+ T(Y). 
We have referred to such maps frequently in the examples and exercises. 
In this chapter they will play a central role in the development. Let us 
summarize their basic properties. 
Let n and k be positive integers. We shall denote by $L(R^n, R^k)$ the
vector space consisting of all linear transformations from $R^n$ into $R^k$. The
most basic fact about a linear transformation

\[ T: R^n \to R^k \]

is that T is uniquely determined by its values on any n linearly independent
vectors $X_1, \ldots, X_n$:

\[ X = c_1 X_1 + \cdots + c_n X_n \]
\[ T(X) = c_1 T(X_1) + \cdots + c_n T(X_n). \]
In particular T is determined by its values on the standard basis vectors
$E_1 = (1, 0, \ldots, 0), \ldots, E_n = (0, 0, \ldots, 0, 1)$. Suppose

(8.1) \[ T(E_j) = (a_{1j}, a_{2j}, \ldots, a_{kj}), \quad j = 1, \ldots, n. \]

The k × n matrix

\[ A = [a_{ij}], \quad 1 \le i \le k, \quad 1 \le j \le n \]
is called the standard representing matrix for T. The equations (8.1)
combine with the linearity of T to yield this explicit description of the action
of T:

(8.2) \[ Y = T(X), \qquad AX^t = Y^t. \]

In (8.2), $X^t$ is the transpose of X, that is, the n × 1 matrix obtained by
standing the n-tuple X up on its right end. Thus the second equation of
(8.2) means:

\[ \begin{bmatrix}
     a_{11} & a_{12} & \cdots & a_{1n} \\
     a_{21} & a_{22} & \cdots & a_{2n} \\
     \vdots & \vdots &        & \vdots \\
     a_{k1} & a_{k2} & \cdots & a_{kn}
   \end{bmatrix}
   \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
   =
   \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{bmatrix},
   \qquad y_i = \sum_{j=1}^{n} a_{ij} x_j. \]

There is no need to keep writing the “t” for transpose, if you can remember
that Y = T(X) corresponds to Y = AX, where the Y and X in the second
equation are column vectors.
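
The construction of the representing matrix from (8.1) is easy to carry out mechanically. The sketch below uses a hypothetical linear map T from $R^3$ into $R^2$, chosen only for illustration, builds A column by column from the images of the standard basis vectors, and checks that Y = AX reproduces T(X).

```python
import numpy as np

def T(X):
    """A sample linear map from R^3 into R^2 (hypothetical)."""
    x1, x2, x3 = X
    return np.array([2 * x1 - x3, x1 + 4 * x2 + 5 * x3])

n = 3
E = np.eye(n)                                      # rows are E_1, ..., E_n
A = np.column_stack([T(E[j]) for j in range(n)])   # jth column is T(E_j)

X = np.array([1.0, -2.0, 0.5])
print(A)
print(A @ X, T(X))                                 # the two vectors agree
```

Here A has k = 2 rows and n = 3 columns, matching the k × n shape described in the text.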

Any map from $R^n$ into $R^k$ is (or, can be) described by a k-tuple of
real-valued functions on $R^n$:

\[ T: R^n \to R^k, \qquad Y = T(X) \]
(8.3)
\[ y_1 = f_1(x_1, \ldots, x_n) \]
\[ \vdots \]
\[ y_k = f_k(x_1, \ldots, x_n). \]

The function $f_i$ is defined by “$f_i(X)$ is the ith coordinate of T(X)”. In other
words, $f_i$ is the composition $f_i = \pi_i \circ T$, where $\pi_i$ is the ith standard
coordinate function on $R^k$:

\[ \pi_i(Y) = y_i. \]

The map T in (8.3) is linear if and only if each $f_i$ is linear (a linear
functional).
If f is a linear functional on $R^n$, then f has the form

(8.4) \[ f(X) = a_1 x_1 + \cdots + a_n x_n \]

where $a_1, \ldots, a_n$ are some fixed real numbers. The matrix $[a_1 \ \cdots \ a_n]$
is the standard representing matrix for the linear functional f. It is usually
identified with the vector $A = (a_1, \ldots, a_n)$ so that (8.4) reads

(8.5) \[ f(X) = \langle A, X \rangle \]
or
\[ f(X) = A \cdot X \]

depending on your notation for inner (or dot) product.

Just remember this about the two standard ways of describing a linear
transformation T. It is uniquely determined by the images of the standard
basis vectors in $R^n$. The image of $E_j$ under T is found by reading down
the jth column of the representing matrix A. The transformation T is also
determined by a k-tuple of linear functionals on $R^n$: $T = (f_1, \ldots, f_k)$.
The form of $f_i$ is

\[ f_i(X) = \langle A_i, X \rangle \]

where $A_i$ is the ith row of the standard matrix A.


From time to time, we shall have occasion to talk about affine maps
on $R^n$. An affine transformation from $R^n$ into $R^k$ is a map

\[ F: R^n \to R^k \]
which has the form
\[ F(X) = Y_0 + T(X) \]
where $Y_0$ is a fixed vector in $R^k$ and $T \in L(R^n, R^k)$. Thus, an affine
transformation is a linear transformation followed by a translation. The linear
transformations are the affine transformations which carry 0 into 0. Think
of a real-valued function on the real line:

\[ f: R \to R. \]

To say that f is affine means that the graph of f is a straight line. To say
that f is linear means that the graph of f is a straight line through the
origin.


8.2. Partial Derivatives 


There are certain definitions and facts about differentiation which you
should be familiar with from Chapter 4. We review them here for
convenience. Let f be a real-valued function defined on an open set in $R^n$.
The partial derivatives of f (if they exist) are

(8.6) \[ (D_j f)(X) = \lim_{t \to 0} \frac{f(X + tE_j) - f(X)}{t}. \]

We will also denote $D_1 f, \ldots, D_n f$ by $\partial f/\partial x_1, \ldots, \partial f/\partial x_n$ when it seems
useful to do so. The function f is of class $C^1$ on the open set U if $D_1 f, \ldots,
D_n f$ exist and are continuous functions on U.

If f has partial derivatives, we can ask whether they are differentiable.
The second order derivative (if it exists)

\[ D_i(D_j f) = \frac{\partial^2 f}{\partial x_i\, \partial x_j} \]

is denoted by $D_{i,j} f$. One of the most basic theorems here is that, if $D_{i,j} f$
and $D_{j,i} f$ exist and are continuous, then

(8.7) \[ D_{i,j} f = D_{j,i} f, \qquad
   \frac{\partial^2 f}{\partial x_i\, \partial x_j} = \frac{\partial^2 f}{\partial x_j\, \partial x_i}. \]

The function f is of class $C^k$ on the open set U if all kth order
derivatives

(8.8) \[ D_{i_1, \ldots, i_k} f = D_{i_1} D_{i_2} \cdots D_{i_k} f
   = \frac{\partial^k f}{\partial x_{i_1} \cdots \partial x_{i_k}} \]

exist and are continuous on U. Of class $C^\infty$ means of class $C^k$ for all k.
In dealing with higher order derivatives, it is convenient to use vector
indices. If $i_1, \ldots, i_k$ are positive integers between 1 and n, then we let
$i = (i_1, \ldots, i_k)$ and

(8.9) \[ D_i f = D_{i_1, \ldots, i_k} f. \]

Note that each of the differential operators $D_1, \ldots, D_n$ is a linear
transformation

\[ D_j: C^1(U) \to C(U) \]

from functions of class $C^1$ into continuous functions. Of course, we also
have

\[ D_j: C^k(U) \to C^{k-1}(U). \]

The second order differential operator

\[ D_{i,j}: C^k(U) \to C^{k-2}(U) \]
is actually the composition $D_{i,j} = D_i \circ D_j$:

\[ C^k \xrightarrow{\ D_j\ } C^{k-1} \xrightarrow{\ D_i\ } C^{k-2}. \]

More generally, (8.8) really means that if $i = (i_1, \ldots, i_k)$, then $D_i$ is the
composition of the linear transformations $D_{i_1}, \ldots, D_{i_k}$.
The concept of partial derivative (8.6) is a special case of directional
derivative. The derivative $(D_j f)(X)$ measures the rate of change of $f$ at $X$
along the line through $X$ which is parallel to the $j$th axis. Rates of change
along other lines are equally important. If $V$ is any vector in $R^n$, then $V$
determines a line through the origin, the 1-dimensional subspace spanned
by $V$. If we translate that line so that it passes through $X$, we have the line

$$L = \{X + tV;\ t \in R\}$$

and we can try to differentiate $f$ at $X$ along $L$:

$$(D_V f)(X) = \lim_{t \to 0} \frac{f(X + tV) - f(X)}{t}. \tag{8.10}$$

The derivative $(D_V f)(X)$ depends not only on the line $L$ but on the vector
$V$ as well, because in (8.10) we used a scale and direction along $L$ which
depended on $V$ as a unit. Sometimes, we may sloppily refer to $(D_V f)(X)$
as the derivative of $f$ at $X$ in the direction of the vector $V$; but we shall
try not to, because $D_V f$ depends upon the length of $V$ as well as its
direction. If we let

$$W = \frac{V}{|V|}$$

then $W$ is the vector of unit length which has the same direction as $V$, and
it is $(D_W f)(X)$ which is properly called the derivative of $f$ at $X$ in the direction
of the vector $V$. Of course,

$$D_W f = \frac{1}{|V|} D_V f.$$


That is a special case of the fact that $D_V f$ is a linear function of $V$ (if $f$ is,
say, a fixed function of class $C^1$). For a fixed $V$, $D_V$ is a linear
transformation

$$D_V : C^1(U) \to C(U)$$

and

$$D_{aV_1 + V_2} = aD_{V_1} + D_{V_2}.$$

In particular, if $V = (v_1, \ldots, v_n)$, then

$$D_V = v_1 D_1 + \cdots + v_n D_n, \qquad D_V f = \sum_{j=1}^{n} v_j \frac{\partial f}{\partial x_j}, \quad f \in C^1. \tag{8.11}$$

If $f$ is of class $C^1$ on the open set $U$, the gradient of $f$ is

$$f' = (D_1 f, \ldots, D_n f) = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right). \tag{8.12}$$

Note that $f'$ is a (continuous) map from $U$ into $R^n$. According to (8.11),
the gradient satisfies

$$D_V f = \langle V, f' \rangle.$$

At any point $X$, the vector $f'(X)$ gives the direction in which $f$ is increasing
most rapidly.
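
As a numerical illustration of (8.10) and the gradient identity, the sketch below compares a difference-quotient estimate of $(D_V f)(X)$ with the inner product $\langle V, f'(X) \rangle$. The sample function, the point, the vector, and the step size are arbitrary choices made for this illustration; none of them come from the text.

```python
import math

def f(x, y):
    # Sample C^1 function, chosen only for illustration.
    return x * x * y + math.sin(y)

def gradient(x, y):
    # Hand-computed partial derivatives (D_1 f, D_2 f) of the sample f.
    return (2.0 * x * y, x * x + math.cos(y))

def directional_derivative(func, point, v, t=1e-6):
    # Symmetric difference quotient approximating the limit in (8.10).
    x, y = point
    forward = func(x + t * v[0], y + t * v[1])
    backward = func(x - t * v[0], y - t * v[1])
    return (forward - backward) / (2.0 * t)

def inner(u, v):
    # Standard inner product on R^n.
    return sum(a * b for a, b in zip(u, v))

point = (1.0, 2.0)
v = (3.0, -1.0)
numeric = directional_derivative(f, point, v)
exact = inner(v, gradient(*point))
print(numeric, exact)
```

The two printed numbers should agree to several decimal places. Replacing `v` by a multiple of itself scales both values by the same factor, which is the dependence on the length of $V$ discussed above.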

Each (real-valued) affine function on $R^n$ is of class $C^\infty$. More generally,
each polynomial

$$f(X) = \sum_{i_1 + \cdots + i_n \le N} a_{i_1 \cdots i_n}\, x_1^{i_1} \cdots x_n^{i_n} \tag{8.13}$$

is of class $C^\infty$. The affine functions are the polynomials of degree 1 (or
zero). Recall the use of vector indices which enables us to rewrite (8.13) as

$$f(X) = \sum_{|i| = 0}^{N} a_i\, x_1^{i_1} \cdots x_n^{i_n} = \sum_{|i| = 0}^{N} a_i X^i.$$

We have used $|i|$ for the weight or degree of the vector index $i = (i_1, \ldots, i_n)$:

$$|i| = i_1 + \cdots + i_n.$$

It gives the degree of $X^i$ and it could be confused with the Euclidean
length of the vector $i$ in $R^n$. Let’s solve that problem by agreeing not to get
confused by that. The coefficient $a_i$ is given by

$$a_i = \frac{1}{i!} (D^i f)(0)$$

where $i! = i_1! \cdots i_n!$ and $D^i = D_1^{i_1} \cdots D_n^{i_n}$.

You are supposed to know Taylor’s theorem, which is of interest in
attempting to approximate a smooth function by polynomials. The form
of that theorem which we shall use says this. Let $f$ be a function of class
$C^{k+1}$ on a convex neighborhood $U$ of the point $A = (a_1, \ldots, a_n)$. Then

$$f(X) = \sum_{|i| = 0}^{k} \frac{1}{i!} (D^i f)(A)\,(X - A)^i + R_k(X)$$

where the remainder $R_k(X)$ is given by

$$R_k(X) = (k + 1) \sum_{|i| = k+1} \frac{(X - A)^i}{i!} \int_0^1 (D^i f)\big(A + t(X - A)\big)\,(1 - t)^k \, dt.$$


A special set of functions of class $C^\infty$ on $U$ consists of the real-analytic
functions on $U$, those such that for each point $A \in U$ they can be expanded
in a convergent power series

$$f(X) = \sum_{i} c_i (X - A)^i. \tag{8.14}$$

If $f$ is real-analytic on $U$ and if $A$ is a point of $U$, then the coefficients $c_i$ in
(8.14) are uniquely determined. They are

$$c_i = \frac{1}{i!} (D^i f)(A)$$

and there is a largest symmetric box about $A$

$$B = \{X;\ |x_j - a_j| < \delta,\ j = 1, \ldots, n\}$$

in which the series (8.14) converges.
Bear in mind that the real-analytic functions are extremely special.
We shall be working more with the classes $C^k$ and $C^\infty$.


8.3. Differentiable Functions 


What is a differentiable function on $R^n$? We could define it as a
function $f$ for which the partial derivatives $D_1 f, \ldots, D_n f$ all exist. There are
several difficulties with that as a definition of differentiable function. First,
the mere existence of $(D_1 f)(X), \ldots, (D_n f)(X)$ does not imply that $f$ is
differentiable in the directions of other vectors. (If $D_1 f, \ldots, D_n f$ exist and
are continuous throughout an open set $U$, then there is no difficulty with
directional derivatives.) Second, most mathematical definitions which
depend upon a particular choice of coordinates lead to conceptual
difficulties at some point. Third, what is going to play the role of the derivative
of $f$? It would be nice to have “differentiable” mean that the “derivative”
exists.

One way to counter the first and second objections is to require for
differentiability at $X$ that $(D_V f)(X)$ should exist for every vector $V$ in $R^n$,
that is, to require differentiability in every direction. The astute reader may be
able to guess that that’s not quite enough to require. Suppose that all the
directional derivatives existed but the formula

$$(D_V f)(X) = v_1 (D_1 f)(X) + \cdots + v_n (D_n f)(X)$$

broke down. That would create havoc. We want $(D_V f)(X)$ to be a linear
function of $V$. So, we could say: $f$ is differentiable at $X$ if every directional
derivative $(D_V f)(X)$ exists and if $(D_V f)(X)$ is a linear function of $V$. What,
then, would play the role of the derivative of $f$? Answer: The linear
functional $\ell$:

$$\ell(V) = (D_V f)(X)$$

will be the derivative of $f$ at $X$.
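
To see concretely how the formula for $(D_V f)(X)$ can break down, here is a small sketch using a standard counterexample that is not taken from the text: $f(x, y) = x^2 y/(x^2 + y^2)$ with $f(0, 0) = 0$. Every directional derivative exists at the origin, but $D_V f$ is not a linear function of $V$ there. The helper names and the step size are assumptions of this illustration.

```python
def f(x, y):
    # Standard counterexample (an assumption of this sketch, not from the text).
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x * x + y * y)

def directional_derivative_at_origin(v, t=1e-6):
    # Difference quotient (f(tV) - f(0)) / t from (8.10) with X = 0.
    return (f(t * v[0], t * v[1]) - f(0.0, 0.0)) / t

d1 = directional_derivative_at_origin((1.0, 0.0))  # estimates (D_1 f)(0)
d2 = directional_derivative_at_origin((0.0, 1.0))  # estimates (D_2 f)(0)
dv = directional_derivative_at_origin((1.0, 1.0))  # estimates (D_V f)(0), V = (1, 1)
print(d1, d2, dv)
```

Both partial derivatives vanish at the origin, so linearity would force $(D_V f)(0) = 0$ for $V = (1, 1)$, yet the difference quotient along that direction stays near $1/2$.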

Now, we are going to start afresh on the question: “What should 
differentiable mean?” and look at it from a geometrical point of view. 
We'll come to the same conclusions which we just reached, but in a some- 
what neater form. 

Suppose $f$ is a real-valued function on an interval of the real line. If
$x$ is a point of that interval, the derivative

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

(if it exists) is the slope of the tangent line to the graph of $f$ at $x$. Indeed,
the definition of tangent line involves the limit of chords, a limit which
is the geometrical analogue of the definition of $f'(x)$. The tangent line has
the equation

$$y(t) = f(x) + (t - x) f'(x)$$

and is the line which “best approximates” $f$ near $x$. (See Figure 28.)

FIGURE 28. The tangent line $y = f(x) + (t - x) f'(x)$.

What (precisely) does that mean? It means this. Suppose we search for an affine
function

$$a(t) = b(t - x) + c$$

to approximate $f(t)$ near $x$. If we want

$$\lim_{t \to x}\, [f(t) - a(t)] = 0$$

the constant $c$ must be chosen as $f(x)$. If we take $c = f(x)$ and $b = f'(x)$,
then we get

$$\lim_{t \to x} \frac{f(t) - a(t)}{t - x} = 0,$$

i.e., $f(t) - a(t)$ goes to zero faster than $t - x$ as $t$ approaches $x$.
Quite often, it is more convenient to phrase the last remarks this way:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

$$f(x + h) = f(x) + h f'(x) + R(h; x), \qquad \lim_{h \to 0} \frac{1}{h} R(h; x) = 0. \tag{8.15}$$

If $x$ is fixed throughout some given discussion, analysts will often write
(8.15) this way:

$$f(x + h) = f(x) + h f'(x) + o(h)$$

where $o(h)$ is a generic expression for any function of $h$ which goes to zero
faster than $h$ does (as $h$ tends to 0). The little “oh” may denote many
different functions in the same discussion:

$$h^2 = o(h), \qquad h^3 = o(h), \qquad h \ne o(h).$$


The point of such a notation is that in approximation problems such as
the one we have been discussing we obtain approximations such as (8.15)
in which we don’t care precisely what the remainder $R(h; x)$ is. All we care
about is the fact that

$$R(h; x) = o(h).$$
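
The statement $R(h; x) = o(h)$ can be watched numerically. The sketch below, with a sample function and base point chosen only for illustration, tabulates $R(h; x)/h$ for shrinking $h$; the ratios should tend to 0.

```python
def remainder(func, deriv, x, h):
    # R(h; x) = f(x + h) - f(x) - h f'(x), as in (8.15).
    return func(x + h) - func(x) - h * deriv(x)

def f(t):
    # Sample function for this sketch.
    return t ** 3

def fprime(t):
    # Its derivative, computed by hand.
    return 3 * t ** 2

x = 2.0
steps = [10.0 ** -k for k in range(1, 6)]
ratios = [remainder(f, fprime, x, h) / h for h in steps]
for h, r in zip(steps, ratios):
    print(h, r)
```

For this particular $f$ the ratio equals $3xh + h^2$ exactly, so it shrinks roughly in proportion to $h$.
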
Now, let’s get back to the case of a function of several variables:

$$f : U \to R, \qquad U \subset R^n.$$

Fix a point $X$ in $U$. Differentiability of $f$ at $X$ will correspond to the
existence of a tangent “plane” to the graph of $f$ at the point $(X, f(X))$, as
indicated in Figure 29. The tangent “plane” will actually be $n$-dimensional
(a hyperplane in $R^{n+1}$) and it will be defined by an affine function

$$y = a(T) = c_0 + a_1 t_1 + \cdots + a_n t_n.$$

FIGURE 29

Since we want the hyperplane to pass through the point $(X, f(X))$, it is
more convenient to write the affine function as

$$a(T) = c + b_1 (t_1 - x_1) + \cdots + b_n (t_n - x_n) = c + \ell(T - X)$$

where $\ell$ is a linear function. Clearly we want $c = f(X)$. What will it mean
to say that we can find some hyperplane

$$y = f(X) + \ell(T - X)$$

which is tangent to the graph of $f$ at $(X, f(X))$?

After a long hassle about the geometric meaning of tangency, we
would eventually come around to agreeing that what the tangency means
is that the hyperplane approximates the graph to the first order; i.e., that
(as $T$ approaches $X$) $f(T) - a(T)$ goes to zero faster than $T - X$ does.
We can express that analytically in several ways:

(i) $\displaystyle \lim_{T \to X} \frac{f(T) - a(T)}{|T - X|} = 0$

(ii) $\displaystyle \lim_{T \to X} \frac{f(T) - f(X) - \ell(T - X)}{|T - X|} = 0$

(iii) $f(X + H) = f(X) + \ell(H) + R(X; H)$, where
$\displaystyle \lim_{H \to 0} \frac{1}{|H|} R(X; H) = 0$

(iv) $f(X + H) = f(X) + \ell(H) + o(H)$.

In (iv), $o(H)$ means $o(|H|)$.

Definition. Let $f$ be a real-valued function on an open set $U$ in $R^n$, and
let $X$ be a point of $U$. The function $f$ is differentiable at $X$ if there exists a
linear functional $\ell$ on $R^n$ such that

$$f(X + H) = f(X) + \ell(H) + o(H). \tag{8.16}$$


There is one simple thing we must verify, namely, if $f$ is differentiable
at $X$ the linear functional $\ell$ in (8.16) is unique. We see that as follows.
Suppose that we also have

$$f(X + H) = f(X) + \ell_1(H) + o(H).$$

Subtract and you conclude that

$$\ell(H) - \ell_1(H) = o(H). \tag{8.17}$$

Remember that $o(H)$ is not a specific function but a symbol for any
function of $H$ which goes to zero faster than $H$ does as $H$ goes to 0. If you forget
that, you’ll get confused by such statements as

$$o(H) - o(H) = o(H).$$

We want to deduce from (8.17) that $\ell = \ell_1$. What we need to show is that
if $\lambda$:

$$\lambda(H) = a_1 h_1 + \cdots + a_n h_n$$

is a linear functional on $R^n$ and $\lambda(H) = o(H)$, then $\lambda = 0$. You should
be able to do that fairly easily.


Definition. If $f$ is differentiable at $X$, then the linear functional $\ell$ in
(8.16) is called the derivative of $f$ at $X$ and is denoted by $(Df)(X)$:

$$\ell = (Df)(X).$$

Theorem 1. If $f$ is differentiable at $X$ and $\ell = (Df)(X)$, then all
directional derivatives of $f$ at $X$ exist and

$$(D_V f)(X) = \ell(V), \qquad V \in R^n.$$


Proof. Let $V$ be a non-zero vector in $R^n$. We want to show that

$$\lim_{t \to 0} \frac{f(X + tV) - f(X)}{t} = \ell(V). \tag{8.18}$$

Since $f$ is differentiable at $X$,

$$f(X + tV) = f(X) + \ell(tV) + o(tV). \tag{8.19}$$

Since $V$ is a fixed non-zero vector and $|tV| = |t|\,|V|$, $o(tV)$ may as well
be replaced by $o(t)$. Since $\ell$ is linear, $\ell(tV) = t\ell(V)$. From (8.19) we have

$$\frac{f(X + tV) - f(X)}{t} = \ell(V) + \frac{o(t)}{t} = \ell(V) + o(1).$$

What does that say? It is precisely (8.18).

A particular consequence of Theorem 1 is that the partial derivatives
$(D_1 f)(X), \ldots, (D_n f)(X)$ exist and

$$(D_j f)(X) = \ell(E_j).$$

Thus we have

$$\ell(V) = v_1 (D_1 f)(X) + \cdots + v_n (D_n f)(X)$$

for every vector $V$ in $R^n$. In other words, $(Df)(X)$ is the linear functional
“inner product with the gradient $f'(X)$”.
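
The defining relation (8.16), with $\ell$ taken to be inner product with the gradient, can be checked numerically. The following sketch uses a sample function and point chosen for illustration only; it prints the ratio $|f(X + H) - f(X) - \langle H, f'(X) \rangle| / |H|$ for a sequence of shrinking vectors $H$.

```python
import math

def f(x, y):
    # Sample C^1 function (an arbitrary choice for this sketch).
    return math.exp(x) * math.cos(y)

def grad_f(x, y):
    # Gradient f'(X) = (D_1 f, D_2 f), computed by hand for the sample f.
    return (math.exp(x) * math.cos(y), -math.exp(x) * math.sin(y))

def remainder_ratio(point, direction, scale):
    # |f(X + H) - f(X) - <H, f'(X)>| / |H| with H = scale * direction.
    h = (scale * direction[0], scale * direction[1])
    g = grad_f(*point)
    linear_part = h[0] * g[0] + h[1] * g[1]
    remainder = f(point[0] + h[0], point[1] + h[1]) - f(*point) - linear_part
    return abs(remainder) / math.hypot(*h)

point = (0.5, 1.0)
ratios = [remainder_ratio(point, (1.0, 2.0), 10.0 ** -k) for k in range(1, 5)]
print(ratios)
```

The ratios should decrease toward 0 as $H$ shrinks, which is what $o(H)$ asserts. Trying other directions in place of `(1.0, 2.0)` gives the same qualitative behavior.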


Theorem 2. Let $f$ be a real-valued function on an open set $U$ in $R^n$. The
two following statements are equivalent:

(i) $f$ is a function of class $C^1$.
(ii) $f$ is differentiable at each point of $U$ and the derivative $Df$ is
continuous.


Proof. Assume (i), that is, assume that the partial derivatives
$D_1 f, \ldots, D_n f$ exist and are continuous functions on $U$. Fix a vector $X$
in $U$. We showed in Chapter 4 that the directional derivatives of $f$ at $X$
exist and are given by

$$(D_V f)(X) = v_1 (D_1 f)(X) + \cdots + v_n (D_n f)(X). \tag{8.20}$$

Define

$$\ell(V) = (D_V f)(X), \qquad V \in R^n.$$

From (8.20) it is clear that $\ell$ is a linear function on $R^n$. The question is:
Does $f(X + H) - f(X) - \ell(H)$ go to 0 faster than $|H|$ does as $H$ tends
to 0?

In the one-variable case, it is easy to see that it does. In fact, we already
discussed that in motivating our definition of differentiable function. In
the one-variable case, we do not need to know that $f'$ is continuous; but
here is the proof for the one-variable case that parallels the proof for the
$n$-variable case. The mean value theorem gives

$$f(x + h) - f(x) = f'(c)\,h$$

where $c$ is a point between $x$ and $x + h$. The point $c$ depends on $h$; but
all that we need to know about it is that as $h$ tends to 0 the point $c$
approaches $x$ (because it is trapped between $x$ and $x + h$). Since $f'$ is
continuous,

$$\lim_{h \to 0} f'(c) = f'(x).$$

Therefore,

$$f(x + h) - f(x) - h f'(x) = h\,[f'(c) - f'(x)] = o(h).$$

Now, let’s look at the two-variable case. The $n$-variable case is a
simple extension of this one. We look at $f(X + H) - f(X)$ as a function
of $h_1$, and then as a function of $h_2$. We assume throughout that $H$ is small
enough so that $X + H$ stays in a square centered at $X$ and contained in
$U$. Write

$$f(X + H) - f(X) = f(x_1 + h_1, x_2 + h_2) - f(x_1, x_2 + h_2) + f(x_1, x_2 + h_2) - f(x_1, x_2). \tag{8.21}$$

Fix $h_2$, and let

$$g_1(t) = f(t, x_2 + h_2).$$

Then $g_1$ is a function of class $C^1$ on the closed interval between $x_1$ and
$x_1 + h_1$. Furthermore, $g_1'(t) = (D_1 f)(t, x_2 + h_2)$. Thus

$$g_1(x_1 + h_1) - g_1(x_1) = h_1 (D_1 f)(c_1, x_2 + h_2) \tag{8.22}$$

where $c_1$ depends upon $h_1$ but lies between $x_1$ and $x_1 + h_1$. If we define

$$g_2(t) = f(x_1, t)$$

then we also have

$$g_2(x_2 + h_2) - g_2(x_2) = h_2 (D_2 f)(x_1, c_2) \tag{8.23}$$

where $c_2$ is between $x_2$ and $x_2 + h_2$. Let

$$P_1 = (c_1, x_2 + h_2), \qquad P_2 = (x_1, c_2).$$

Then (8.21), (8.22), and (8.23) yield this:

$$f(X + H) - f(X) = h_1 (D_1 f)(P_1) + h_2 (D_2 f)(P_2).$$

We want to know about the behavior of

$$f(X + H) - f(X) - \ell(H)$$

where

$$\ell(H) = h_1 (D_1 f)(X) + h_2 (D_2 f)(X).$$

We have

$$f(X + H) - f(X) - \ell(H) = \sum_{i=1}^{2} h_i\,[(D_i f)(P_i) - (D_i f)(X)]. \tag{8.24}$$

The points $P_1$ and $P_2$ depend upon $H$. What is important is that each of
them has its $i$th coordinate between $x_i$ and $x_i + h_i$, so that

$$\lim_{H \to 0} P_i = X, \qquad i = 1, 2.$$

Since each $D_i f$ is continuous, it should be reasonably apparent from
(8.24) that

$$f(X + H) - f(X) - \ell(H) = o(H).$$


We have proved that $f$ is differentiable at each point of $U$. We have
also asserted in (ii) that the derivative $Df$ is continuous. What does that
mean? For each $X$, $(Df)(X)$ is a linear functional on $R^n$. Thus $Df$ is a map

$$Df : U \to L(R^n, R^1).$$

What we have asserted is that $(Df)(X)$ depends continuously on $X$. The
continuity is defined relative to some norm on the space of linear
functionals $L(R^n, R^1)$. Since the space is finite-dimensional, it doesn’t matter
which norm we use, because all norms on $L(R^n, R^1)$ will define the same
convergent sequences. For instance, we could use the “natural” norm

$$\|\ell\| = \sup_{|X| \le 1} |\ell(X)|.$$

This turns out to be identical with

$$\|\ell\| = |A|$$

where $A$ is the vector in $R^n$ which determines $\ell$:

$$\ell(H) = \langle H, A \rangle = a_1 h_1 + a_2 h_2 + \cdots + a_n h_n.$$

We know that the linear functional $\ell = (Df)(X)$ is determined by the
vector $A = f'(X)$. Since the partial derivatives $D_1 f, \ldots, D_n f$ are
continuous, it is clear that $f'(X)$ depends continuously on $X$.

There is a second statement in Theorem 2, that (i) follows from (ii).
We leave that as an exercise.

Exercises


1. If $\ell$ is a linear functional on $R^n$, the norm of $\ell$ is

$$\|\ell\| = \sup_{X \ne 0} \frac{|\ell(X)|}{|X|}.$$

Prove that

(a) $\|\ell\| < \infty$;
(b) $\|\ell\|$ is the smallest non-negative number $M$ such that
$|\ell(X)| \le M|X|$ for all $X$;
(c) $\|\ell\| = |A|$, where $A$ is the unique vector in $R^n$ such that
$\ell(X) = \langle X, A \rangle$, $X \in R^n$.

2. Prove that, if $\ell$ is a linear functional on $R^n$ and

$$\ell(H) = o(H)$$

then $\ell = 0$.


3. What does it mean to say 


f(H) = o(1) 


for small H? 
4. If $f$ is differentiable at $X$, then $f$ is continuous at $X$.
5. If f(x, y) = xy? + x3 y4 and € = (Df)(5, 6), what is €(7, 8)? 


6. Let $\langle \cdot, \cdot \rangle$ be an inner product on $R^n$ and define $N(X) = \langle X, X \rangle^{1/2}$. Prove
that $N$ is differentiable at every point except $X = 0$ and

$$(DN)(X) = \frac{1}{N(X)} \langle \cdot, X \rangle.$$

7. If $f$ is a differentiable (real-valued) function on a connected open set $U \subset R^n$
and if $Df = 0$, then $f$ is constant.

8. Let $f$ be a differentiable function on the open set $U$. If $f$ has either a local
maximum or local minimum at the point $X \in U$, then $f'(X) = 0$.


8.4. Differentiable Maps


Now we consider a map

$$F : U \to R^k, \qquad U \subset R^n$$

that is, a vector-valued function on an open set $U$ in $R^n$. What shall it mean
for $F$ to be differentiable at a point $X$? Here, we do not try to give a
geometrical motivation. Differentiability will mean that it is possible to
approximate $F$ sufficiently well by a linear function from $R^n$ into $R^k$.

Definition. Let

$$F : U \to R^k, \qquad U \subset R^n$$

be a map defined on an open set $U$ in $R^n$, mapping $U$ into $R^k$. Let $X$ be a
point in $U$. We say that $F$ is differentiable at $X$ if there exists a linear
transformation

$$L : R^n \to R^k$$

such that

$$F(X + H) = F(X) + L(H) + o(H). \tag{8.25}$$

If $F$ is differentiable at each point of $U$, we call $F$ a differentiable map.

This definition subsumes the case of a real-valued map. Another
elementary remark is that differentiability implies continuity.

If $F$ is differentiable at $X$, then the linear transformation $L$ in (8.25)
is unique; it is called the derivative of $F$ at $X$ and denoted by $L = (DF)(X)$;
the standard representing matrix for $(DF)(X)$ is denoted by $F'(X)$ and
is called the Jacobian matrix of $F$ at $X$. Note that this is consistent with
our previous notation $f'$ for the gradient of a function.


Theorem 3. Let $F = (f_1, \ldots, f_k)$ be a map from an open set $U$ into $R^k$,
and let $X$ be a point of $U$. The map $F$ is differentiable at $X$ if and only if each
$f_i$ is differentiable at $X$. If $F$ is differentiable at $X$, then

$$(DF)(X) = \big((Df_1)(X), \ldots, (Df_k)(X)\big).$$


Proof. The map $F$ is given by a $k$-tuple of real-valued functions:
$f_i(X)$ is the $i$th coordinate of $F(X)$. In other words,

$$Y = F(X)$$

$$\begin{aligned} y_1 &= f_1(x_1, \ldots, x_n) \\ &\ \ \vdots \\ y_k &= f_k(x_1, \ldots, x_n). \end{aligned}$$

If each $f_i$ is differentiable at $X$, then for each $i$ we have

$$f_i(X + H) - f_i(X) - \ell_i(H) = o(H) \tag{8.26}$$

where $\ell_i$ is a linear functional on $R^n$. The map $L = (\ell_1, \ldots, \ell_k)$ is then
a linear transformation from $R^n$ into $R^k$. We assert that $F$ is differentiable
at $X$ with derivative $L$. Let

$$G(H) = F(X + H) - F(X) - L(H).$$

According to (8.26), for each $i$, the $i$th coordinate of $G(H)$ goes to 0 faster
than $|H|$ does as $H$ tends to 0. Therefore,

$$G(H) = o(H).$$

Conversely, suppose that $F$ is differentiable at $X$ with $(DF)(X) = L$. Now
$L$ is given by a $k$-tuple of real-valued functions $L = (\ell_1, \ldots, \ell_k)$. Since
$L$ is linear, each $\ell_i$ is linear. The norm of $f_i(X + H) - f_i(X) - \ell_i(H)$ does
not exceed the norm of $F(X + H) - F(X) - L(H)$. Therefore, we have
(8.26), which is what we want.


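With Theorem 3, we can build many differentiable maps by combining
$k$-tuples of differentiable functions. From the relation $L = (\ell_1, \ldots, \ell_k)$
which we have just been discussing, we see the following. The linear
functional $\ell_i$ is determined by the gradient of $f_i$ at $X$. Therefore, the Jacobian
matrix of $F$ is given by

$$F' = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial f_k}{\partial x_1} & \cdots & \dfrac{\partial f_k}{\partial x_n} \end{bmatrix}.$$

For that reason, other standard notations for the Jacobian matrix of the
map $F = (f_1, \ldots, f_k)$ are:

$$\frac{dF}{dX} \quad \text{or} \quad \frac{dY}{dX} \tag{8.27}$$

$$\frac{\partial(f_1, \ldots, f_k)}{\partial(x_1, \ldots, x_n)} \quad \text{or} \quad \frac{\partial(y_1, \ldots, y_k)}{\partial(x_1, \ldots, x_n)}. \tag{8.28}$$

This notation is frequently handy.
Now, let’s take a look at some examples of maps which arise naturally,
without any reference to coordinates.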
Now, let’s take a look at some examples of maps which arise naturally, 
without any reference to coordinates. 


EXAMPLE 1. Let $L$ be a linear transformation

$$L : R^n \to R^k.$$

Then $L$ is a differentiable map and $(DL)(X) = L$ for every $X$, because

$$L(X + H) = L(X) + L(H).$$

The Jacobian matrix for $L$ is (constantly equal to) the standard representing
matrix for $L$.


EXAMPLE 2. Suppose that we identify $R^2$ with the field of complex
numbers by identifying the point $X = (x_1, x_2)$ with the complex number
$z = x_1 + ix_2$. Then we can define a map from $R^2$ into $R^2$ by $F(z) = z^2$.
More precisely, $F(x_1, x_2)$ is the point in the plane which corresponds to
the point $(x_1 + ix_2)^2$. That means that

$$F(x_1, x_2) = (x_1^2 - x_2^2,\ 2x_1 x_2)$$

or

$$Y = F(X)$$

where

$$y_1 = f_1(x_1, x_2) = x_1^2 - x_2^2, \qquad y_2 = f_2(x_1, x_2) = 2x_1 x_2.$$

Since $f_1$ and $f_2$ are polynomials, clearly $F$ is a differentiable map. The
Jacobian matrix for $F$ is

$$F'(X) = 2 \begin{bmatrix} x_1 & -x_2 \\ x_2 & x_1 \end{bmatrix}$$

so that the derivative of $F$ at $X$ is the linear transformation

$$L(H) = 2 \begin{bmatrix} x_1 & -x_2 \\ x_2 & x_1 \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} = 2(x_1 h_1 - x_2 h_2,\ x_2 h_1 + x_1 h_2). \tag{8.29}$$

Notice what (8.29) corresponds to when phrased in terms of complex
multiplication. If $X$ corresponds to the complex number $z$ and $H$
corresponds to the complex number $w$, then (8.29) can be rewritten

$$L(w) = 2zw. \tag{8.30}$$

That is, the derivative of the map $F$ at the point $z$ is the linear
transformation “complex multiplication by $2z$”.
That is quite easy to see without reference to coordinates, matrices,
etc. We have

$$F(z + w) = (z + w)^2 = z^2 + 2zw + w^2 = F(z) + 2zw + w^2.$$

With $z$ fixed, $2zw$ is a linear function of $w$. Also $w^2 = o(w)$. Therefore, it
is immediate from the definition of derivative that $(DF)(z) = L$, where $L$
is defined by (8.30).
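
The agreement between the matrix form (8.29) and the complex form (8.30) can be confirmed with a few lines of arithmetic. In the sketch below the particular values of $z$ and $w$ and the helper names are assumptions made for illustration.

```python
def F(x1, x2):
    # F(z) = z^2 written in real coordinates, as in Example 2.
    return (x1 * x1 - x2 * x2, 2.0 * x1 * x2)

def jacobian_times(x1, x2, h1, h2):
    # L(H) from (8.29): 2 * [[x1, -x2], [x2, x1]] applied to (h1, h2).
    return (2.0 * (x1 * h1 - x2 * h2), 2.0 * (x2 * h1 + x1 * h2))

z = complex(1.5, -0.5)
w = complex(0.25, 2.0)

via_matrix = jacobian_times(z.real, z.imag, w.real, w.imag)
via_complex = 2 * z * w  # (8.30): complex multiplication by 2z

# The remainder F(z + w) - F(z) - L(w) should be exactly w^2.
shifted = z + w
remainder = complex(*F(shifted.real, shifted.imag)) - complex(*F(z.real, z.imag)) - via_complex

print(via_matrix, via_complex, remainder, w * w)
```

Both routes to $L(w)$ give the same pair of real numbers, and the remainder equals $w^2$, which is the $o(w)$ term in the coordinate-free computation.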


EXAMPLE 3. In Example 2, it is tempting to say, “Of course $z^2$ is
differentiable and of course its derivative is $2z$ because the function $F(z) = z^2$
is complex-differentiable and its complex derivative is $2z$:

$$\lim_{w \to 0} \frac{F(z + w) - F(z)}{w} = 2z.\text{”}$$

But that is not the same derivative as the one which we are discussing in
this chapter. The function $F(z) = \bar{z}$ is not complex-differentiable;
however, it is differentiable as a map from $R^2$ into $R^2$ because it is a linear
transformation on $R^2$. Remember that complex-differentiability is a very
special property.

Still, the occurrence of $2z$ in (8.30) could hardly be an accident. Let’s
see why. Let $U$ be an open set in the plane and let

$$F : U \to C.$$

Sec. 8.4 Differentiable Maps 


Suppose that $F$ is complex-differentiable. Let $F'$ be its complex derivative.
Then

$$F(z + w) = F(z) + F'(z)w + o(w)$$

because that is the definition of complex derivative. Now $F'(z)w$ is a linear
function of $w$. Therefore, the map which $F$ defines from $R^2$ to $R^2$ is
differentiable and the derivative of $F$ at $z$ is “multiplication by the complex
number $F'(z)$”. Notice that the $F'$ here is not the Jacobian matrix. We
hope that this example clears up more confusion than it causes.


EXAMPLE 4. Fix a positive integer $n$. Let $R^{n \times n}$ be the space of $n \times n$
(real) matrices. Then $R^{n \times n}$ is isomorphic to $R^{n^2}$, so that we may discuss the
differentiability of maps on $R^{n \times n}$. Here is an interesting one. Let $U$ be the
set of invertible $n \times n$ matrices. Recall that $U$ is an open subset of $R^{n \times n}$.
The proof of that runs as follows. Suppose $A$ is an invertible $n \times n$ matrix.
Let’s show that $A + H$ is invertible for all matrices $H$ of small norm. We
rewrite

$$A + H = A(I + A^{-1}H).$$

Now $I + A^{-1}H$ will be invertible provided $|A^{-1}H|$ is small, because we
can use the power series for $1/(1 + x)$. Choose $\delta > 0$ so that $\delta |A^{-1}| < 1$.
Then

$$(A + H)^{-1} = (I + A^{-1}H)^{-1} A^{-1} = A^{-1} - A^{-1}HA^{-1} + A^{-1}HA^{-1}HA^{-1} - \cdots, \qquad |H| < \delta. \tag{8.31}$$

Now let $F$ be the inversion map on $U$:

$$F(A) = A^{-1}.$$

Is $F$ differentiable? If there is any justice, $F$ should be differentiable with
derivative $-A^{-2}$. That can’t be correct, because $(DF)(A)$ must be a linear
transformation from $R^{n \times n}$ into $R^{n \times n}$. Aha! It must be the linear
transformation “multiplication by $-A^{-2}$”. Multiplication on the left or on the
right? There is no reason to prefer one side to the other, so we’ll put one
$A^{-1}$ on each side. We are led to guess that $(DF)(A)$ is the linear
transformation on the space of matrices defined by

$$L(H) = -A^{-1}HA^{-1}.$$

We can change this rapidly into more than a guess. Look at (8.31).
It tells us that

$$F(A + H) = F(A) - A^{-1}HA^{-1} + A^{-1}HA^{-1}HA^{-1} - \cdots$$

so that

$$|F(A + H) - F(A) + A^{-1}HA^{-1}| \le \sum_{k=2}^{\infty} |H|^k |A^{-1}|^{k+1} = |H|^2 \sum_{k=2}^{\infty} |H|^{k-2} |A^{-1}|^{k+1} = o(|H|).$$
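
The guess $L(H) = -A^{-1}HA^{-1}$ can also be tested numerically. The sketch below works with $2 \times 2$ matrices stored as nested lists; the particular matrices $A$ and $H$, the helper functions, and the use of the largest absolute entry as a norm are all choices made for this illustration (any norm will do on a finite-dimensional space).

```python
def matmul(a, b):
    # Product of two 2 x 2 matrices stored as nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(a):
    # Closed-form inverse of a 2 x 2 matrix (assumes a nonzero determinant).
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def add(a, b, scale=1.0):
    # Entrywise a + scale * b.
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]

def max_abs(a):
    # Largest absolute entry, used here as a convenient matrix norm.
    return max(abs(x) for row in a for x in row)

A = [[2.0, 1.0], [1.0, 3.0]]
H = [[0.3, -0.2], [0.1, 0.4]]
A_inv = inverse(A)
L_of_H = [[-x for x in row] for row in matmul(matmul(A_inv, H), A_inv)]

ratios = []
for k in range(1, 5):
    s = 10.0 ** -k
    exact_change = add(inverse(add(A, H, s)), A_inv, -1.0)  # F(A + sH) - F(A)
    remainder = add(exact_change, L_of_H, -s)               # subtract L(sH) = s L(H)
    ratios.append(max_abs(remainder) / (s * max_abs(H)))
print(ratios)
```

The printed ratios $|F(A + sH) - F(A) - L(sH)| / |sH|$ should shrink roughly in proportion to $s$, consistent with the $o(|H|)$ estimate.
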

There are a few basic facts about combining differentiable maps which
we will now take up. If $F$ and $G$ are maps into $R^k$ and if they
are both differentiable at $X$, then $F + G$ is differentiable at $X$ and
$D(F + G) = DF + DG$. That is easy to verify. Let’s turn to the product of
differentiable maps.


Theorem 4. Let $U$ be an open set in $R^n$. Let $F$ be a map from $U$ into $R^k$
and let $f$ be a real-valued function on $U$. If both $f$ and $F$ are differentiable at
$X$, with respective derivatives $\ell$ and $L$, then the product $fF$ is differentiable
at $X$ and its derivative $M = (D(fF))(X)$ is given by:

$$M(H) = \ell(H)F(X) + f(X)L(H)$$

that is,

$$(fF)'(X) = f'(X)F(X) + f(X)F'(X).$$

Proof. Let $\ell = (Df)(X)$, $L = (DF)(X)$, and

$$g(H) = f(X + H) - f(X) - \ell(H)$$

$$G(H) = F(X + H) - F(X) - L(H).$$

Now

$$\begin{aligned}
(fF)(X + H) - (fF)(X) &= (f(X + H) - f(X))F(X + H) + f(X)(F(X + H) - F(X)) \\
&= \ell(H)F(X + H) + f(X)L(H) + g(H)F(X + H) + f(X)G(H) \\
&= \ell(H)F(X) + f(X)L(H) + \ell(H)[L(H) + G(H)] + g(H)F(X + H) + f(X)G(H).
\end{aligned}$$

Now, all we have to do is to show that the sum of the last three terms is
$o(H)$. Since $G(H) = o(H)$, $g(H) = o(H)$ and since $F(X + H)$ tends to
$F(X)$, each of the last two terms tends to zero faster than $|H|$. We can
dispose of $\ell(H)G(H)$ in a similar way, leaving us with $\ell(H)L(H)$ to worry
about. Now

$$|\ell(H)L(H)| \le \|\ell\|\,\|L\|\,|H|^2$$

because $\ell$ and $L$ are linear, and that shows that $\ell(H)L(H) = o(H)$.


We remark that Theorem 4 is valid for the product of two matrix- 
valued functions, with essentially the same proof. Here is a very funda- 
mental way of combining differentiable maps. 


Theorem 5 (Chain Rule). Let $U$ be an open set in $R^n$ and let $V$ be an
open set in $R^k$. Let $F$ and $G$ be maps

$$F : U \to R^k, \qquad G : V \to R^d$$

such that $F(U) \subset V$. Let $X$ be a point in $U$. Suppose that $F$ is differentiable
at $X$ and that $G$ is differentiable at $F(X)$. Then the composition $G \circ F$ is
differentiable at $X$ and

$$D(G \circ F)(X) = (DG)(F(X)) \circ (DF)(X). \tag{8.32}$$


Proof. Let $L = (DF)(X)$ and $M = (DG)(F(X))$. What the theorem
asserts is that $G \circ F$ is differentiable at $X$ and its derivative is the linear
transformation $M \circ L$, from $R^n$ into $R^d$:

$$R^n \xrightarrow{\;L\;} R^k \xrightarrow{\;M\;} R^d.$$

Let $Y = F(X)$. By the definition of derivative

$$F(X + H) = F(X) + L(H) + |H|\,R(H)$$

$$G(Y + K) = G(Y) + M(K) + |K|\,S(K)$$

where

$$\lim_{H \to 0} R(H) = 0, \qquad \lim_{K \to 0} S(K) = 0.$$

Now

$$\begin{aligned}
(G \circ F)(X + H) - (G \circ F)(X) &= G(F(X + H)) - G(F(X)) \\
&= G(Y + L(H) + |H|\,R(H)) - G(Y) \\
&= G(Y + K) - G(Y) \\
&= M(K) + |K|\,S(K)
\end{aligned}$$

where $K = L(H) + |H|\,R(H)$. Since $M$ is linear,

$$M(K) = M(L(H)) + M(|H|\,R(H)) = M(L(H)) + |H|\,M(R(H)).$$

The result will be proved if we can show that

$$|H|\,M(R(H)) + |K|\,S(K) = o(H).$$

The left-hand side (above) is not larger than

$$|H|\,\|M\|\,|R(H)| + \big|L(H) + |H|\,R(H)\big|\,|S(K)| \le |H|\,\big[\|M\|\,|R(H)| + \|L\|\,|S(K)| + |R(H)|\,|S(K)|\big].$$

Now

$$\lim_{H \to 0} K = 0$$

so that

$$\lim_{H \to 0} S(K) = 0.$$

Since $R(H)$ tends to 0 as $H$ does, the result follows.


In terms of the Jacobian matrices, Theorem 5 says

$$(G \circ F)'(X) = G'(F(X))\,F'(X). \tag{8.33}$$
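
The matrix form (8.33) of the chain rule lends itself to a direct numerical check. In the sketch below the maps `F` and `G`, the point `X`, and the finite-difference Jacobian helper are all illustrative assumptions; the Jacobians are estimated by central differences rather than computed symbolically.

```python
import math

def F(x):
    # Sample map from R^2 into R^3 (illustrative choice).
    return [x[0] * x[1], math.sin(x[0]), x[1] ** 2]

def G(y):
    # Sample map from R^3 into R^2 (illustrative choice).
    return [y[0] + y[1] * y[2], math.exp(y[0]) - y[2]]

def jacobian(func, x, step=1e-6):
    # Central-difference estimate of the Jacobian matrix of func at x.
    out_dim = len(func(x))
    columns = []
    for j in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[j] += step
        minus[j] -= step
        fp, fm = func(plus), func(minus)
        columns.append([(fp[i] - fm[i]) / (2 * step) for i in range(out_dim)])
    # columns[j][i] approximates d f_i / d x_j; transpose so rows are indexed by i.
    return [[columns[j][i] for j in range(len(x))] for i in range(out_dim)]

def matmul(a, b):
    # Product of a (p x q) matrix and a (q x r) matrix as nested lists.
    return [[sum(a[i][r] * b[r][j] for r in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

X = [0.7, -1.2]
lhs = jacobian(lambda x: G(F(x)), X)             # (G o F)'(X)
rhs = matmul(jacobian(G, F(X)), jacobian(F, X))  # G'(F(X)) F'(X)
print(lhs)
print(rhs)
```

The two printed $2 \times 2$ matrices should agree up to finite-difference error. Note that $G'$ is evaluated at $F(X)$ and $F'$ at $X$, which is exactly the bookkeeping that (8.33) encodes.
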
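When calculating, it is often most convenient to phrase the chain rule this
way:

$$Y = F(X), \qquad Z = G(Y), \qquad \frac{dZ}{dX} = \frac{dZ}{dY} \frac{dY}{dX}$$

or

$$\frac{\partial(z_1, \ldots, z_d)}{\partial(x_1, \ldots, x_n)} = \frac{\partial(z_1, \ldots, z_d)}{\partial(y_1, \ldots, y_k)}\, \frac{\partial(y_1, \ldots, y_k)}{\partial(x_1, \ldots, x_n)}.$$

In other words,

$$\frac{\partial z_i}{\partial x_j} = \sum_{r=1}^{k} \frac{\partial z_i}{\partial y_r} \frac{\partial y_r}{\partial x_j}. \tag{8.34}$$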


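It is important to remember exactly what (8.34) means. It means this.
Suppose we are given $f_1, \ldots, f_k$, differentiable functions on an open set
in $R^n$. They define a map $F$, variously described as

$$Y = F(X)$$

or

$$\begin{aligned} y_1 &= f_1(x_1, \ldots, x_n) \\ &\ \ \vdots \\ y_k &= f_k(x_1, \ldots, x_n). \end{aligned} \tag{8.35}$$

Then suppose that we have $g_1, \ldots, g_d$, differentiable functions on (an
open set containing) the range of $F$. They define a map $G$, variously
described as

$$Z = G(Y)$$

or

$$\begin{aligned} z_1 &= g_1(y_1, \ldots, y_k) \\ &\ \ \vdots \\ z_d &= g_d(y_1, \ldots, y_k). \end{aligned} \tag{8.36}$$

Now we “express the $z$’s as functions of the $x$’s” by substituting
$f_r(x_1, \ldots, x_n)$ for $y_r$ in (8.36). That describes the composed map $H = G \circ F$:

$$Z = H(X)$$

$$\begin{aligned} z_1 &= h_1(x_1, \ldots, x_n) \\ &\ \ \vdots \\ z_d &= h_d(x_1, \ldots, x_n). \end{aligned} \tag{8.37}$$

Remember that

$$h_i(x_1, \ldots, x_n) = g_i\big(f_1(x_1, \ldots, x_n), \ldots, f_k(x_1, \ldots, x_n)\big).$$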



The meaning of (8.34) and the content of the chain rule is that

$$\frac{\partial h_i}{\partial x_j}(X) = \sum_{r=1}^{k} \frac{\partial g_i}{\partial y_r}(F(X))\, \frac{\partial f_r}{\partial x_j}(X). \tag{8.38}$$

Now, it is a nuisance to keep writing $x_j$, $f_r$, $y_r$,
$g_i$, $z_i$, and $h_i$ if you’re trying to calculate something. Either (8.38) or (8.32)
should be used as the chain rule when you’re trying to understand
carefully what’s going on. But in calculations it is usually simpler to say that
the map $F$ is specified by giving the $y$’s as certain functions of the $x$’s:

$$y_r = y_r(x_1, \ldots, x_n)$$

and that the map $G$ is specified by giving the $z$’s as certain functions of the
$y$’s:

$$z_i = z_i(y_1, \ldots, y_k).$$

Under those constraints, the rate of change of $z_i$ with respect to $x_j$ is

$$\frac{\partial z_i}{\partial x_j} = \sum_{r=1}^{k} \frac{\partial z_i}{\partial y_r} \frac{\partial y_r}{\partial x_j}.$$

All you have to remember is what’s a function of which and where to
evaluate the derivatives.


Theorem 6 (Mean Value Theorem). Let $g$ be a differentiable function
on $U$, a convex open set in $R^n$. Let $A$ and $B$ be points in $U$. There exists a
point $P$, on the line segment from $A$ to $B$, such that

$$g(B) - g(A) = \langle B - A, g'(P) \rangle.$$

Proof. Let

$$F(t) = tB + (1 - t)A, \qquad 0 \le t \le 1.$$

The image of $F$ is the line segment from $A$ to $B$. Let $f = g \circ F$, i.e.,

$$f(t) = g(tB + (1 - t)A).$$

Then $f$ is continuous on the closed interval $[0, 1]$ and differentiable (at
least) on the open interval $(0, 1)$. Furthermore,

$$\frac{df}{dt} = \sum_{j=1}^{n} \frac{\partial g}{\partial x_j} \frac{dx_j}{dt}$$

or, more precisely,

$$f'(t) = \sum_{j=1}^{n} (b_j - a_j)\,(D_j g)(F(t)) = \langle B - A, g'(F(t)) \rangle.$$

The mean value theorem in one variable tells us that there is a number
$c$, $0 < c < 1$, such that $f'(c) = f(1) - f(0)$. Let $P = F(c)$.
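
The proof is constructive enough to imitate numerically. The sketch below picks a sample function $g$ and endpoints $A$, $B$ (all assumptions of this illustration) and locates a parameter $c$ with $\langle B - A, g'(F(c)) \rangle = g(B) - g(A)$ by bisection. Bisection is a convenience that works here because the gap function changes sign on $[0, 1]$ for this sample; the theorem itself does not supply a method for finding $P$.

```python
def g(x, y):
    # Sample differentiable function (illustrative choice).
    return x ** 3 + x * y

def grad_g(x, y):
    # Gradient g'(X), computed by hand for the sample g.
    return (3 * x ** 2 + y, x)

A = (0.0, 0.0)
B = (1.0, 2.0)

def segment(t):
    # F(t) = tB + (1 - t)A from the proof.
    return tuple(t * b + (1 - t) * a for a, b in zip(A, B))

def gap(t):
    # <B - A, g'(F(t))> - (g(B) - g(A)); a root in (0, 1) yields the point P.
    gx, gy = grad_g(*segment(t))
    return (B[0] - A[0]) * gx + (B[1] - A[1]) * gy - (g(*B) - g(*A))

# Bisection on [0, 1]; relies on gap(0) and gap(1) having opposite signs,
# which holds for this sample but not for every g.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if gap(lo) * gap(mid) <= 0:
        hi = mid
    else:
        lo = mid
c = (lo + hi) / 2
P = segment(c)
print(c, P, gap(c))
```

For this sample, $f(t) = t^3 + 2t^2$, so the equation $f'(c) = f(1) - f(0)$ reads $3c^2 + 4c = 3$, and the bisection result can be compared with the positive root of that quadratic.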




Differentiable Mappings Chap. 8 


Exercises 


1. Repeat Exercises 1 and 2 of Section 8.3 for linear transformations rather 
than linear functionals. (Of course, the vector A in Exercise 1 will be replaced by 
the standard representing matrix A.) Prove the uniqueness of the derivative of a 
differentiable map. 


2. A differentiable map is continuous. 


3. Let $U$ be an open set in $R^n$ and let $F$ and $G$ be differentiable mappings from
$U$ into $R^k$. Define the real function $\langle F, G \rangle$ by $\langle F, G \rangle(X) = \langle F(X), G(X) \rangle$.
Prove that $\langle F, G \rangle$ is differentiable on $U$ and that (in a suitable sense)

$$(D\langle F, G \rangle)(X) = \langle (DF)(X), G(X) \rangle + \langle F(X), (DG)(X) \rangle.$$

4. State and prove an analogue of Theorem 4 for matrix-valued functions. 
5. Fix a positive integer $k$, and let

$$F : R^{n \times n} \to R^{n \times n}$$

be the map on $n \times n$ matrices defined by $F(A) = A^k$. Show that $F$ is differentiable
at every point of $R^{n \times n}$, and find $(DF)(A)$.


6. Use Exercise 4 and the chain rule to find the derivative of the map G(A) 
= 4. defined on the set of invertible 7 Χ n matrices. 


7. Let $U$ be an open set in the plane and let $f$ be a complex-valued function on
$U$. Suppose that $f$ is complex-differentiable. Let $F$ be the map from $U$ into $R^2$
which $f$ defines:

$$F = (u, v), \qquad f = u + iv.$$

(We distinguish between $f$ and $F$ so as not to confuse the complex derivative $f'$
with the gradient $F'$.) Show that if $L = (DF)(x, y)$, then (with proper
interpretations),

$$L(a, b) = f'(x + iy)(a + ib).$$


8. Let $F$ be a continuously differentiable map on a neighborhood of the origin
in $R^2$, $F = (u, v)$. Let $f$ be the associated complex-valued function $f = u + iv$.
Show that $f$ cannot be a square root of the identity function $z$, i.e., that

$$[f(x, y)]^2 = x + iy$$

cannot hold on a neighborhood of the origin.

9. If $F$ is a differentiable map from an open set $U$ into $R^k$, then the derivative is
a map

$$DF : U \to L(R^n, R^k).$$

The space $L(R^n, R^k)$ is finite-dimensional; it is essentially $R^{nk}$. So, doesn’t it
make sense to talk about $DF$ being differentiable? What kind of an object would
$(D^2F)(X)$ be? By $D^2F$ we mean $D(DF)$. What does $D^2F$ have to do with second
order partial derivatives? What is a map of class C”? 

10. The mean value theorem gives the first step in Taylor’s theorem in $n$
variables:

$$f(X + H) = f(X) + \sum_{j=1}^{n} h_j (D_j f)(P).$$

Look at $g(t) = f(X + tH)$, $0 \le t \le 1$, and use it to show that (if $f$ is of class $C^2$)

$$f(X + H) = f(X) + \sum_{j=1}^{n} h_j (D_j f)(X) + \frac{1}{2} \sum_{i,j=1}^{n} h_i h_j (D_{i,j} f)(P)$$

where $P$ is on the line segment from $X$ to $X + H$. Generalize to functions of
class $C^{k+1}$. Compare with the integral form of the remainder (Theorem 10 of
Chapter 5).

Suppose that we have a differentiable map

$$F : U \to R^k, \qquad U \subset R^n$$

from an open subset of $R^n$ into $R^k$. We would like to know when $F$ has an
inverse map

$$F^{-1} : F(U) \to R^n \tag{8.39}$$

which is differentiable. The inverse function (8.39) exists if and only if
$F$ is 1:1. If $F$ is 1:1, then, in order to discuss the differentiability of $F^{-1}$,
we will need to know that $F(U)$ is an open set. Suppose for the moment
that $F$ is 1:1, $F(U)$ is open and $F^{-1}$ is differentiable. The chain rule then
tells us that (at each point)

$$(DF^{-1}) \circ (DF) = I_n, \qquad (DF) \circ (DF^{-1}) = I_k$$

where $I_j$ is the identity transformation on $R^j$. Therefore, for any $X$ in $U$,
the Jacobian matrix $F'(X)$ has both a right and a left inverse. So, it must
be a square matrix ($k = n$).

We conclude two things from that brief discussion. First, we need
concern ourselves only with the case of a differentiable map

$U \xrightarrow{F} R^n, \quad U \subset R^n.$

Second, if F is to have a differentiable inverse, then it is necessary that the
Jacobian matrix F'(X) be invertible for each X in U. Now, we are going
to prove that if the continuously differentiable map F is 1:1 and F'(X) is
invertible at each point, then F(U) is an open set and $F^{-1}$ is a differentiable
map. This should be known to the reader in the special case of a linear
transformation: A linear transformation L which is invertible maps $R^n$
onto $R^n$ and the inverse $L^{-1}$ is a linear transformation. We shall handle
more general maps F through approximation of them by linear transformations.
Lemma. Let F be a continuously differentiable map

$U \xrightarrow{F} R^n$

where U is an open set in $R^n$. Let X be a point of U such that (DF)(X) is
invertible, i.e., such that F'(X) is a non-singular matrix. Then X has a
neighborhood W for which there exist constants m > 0 and M > 0 such that

$m|B - A| \le |F(B) - F(A)| \le M|B - A|, \quad A, B \in W.$


Proof. It is the left-hand inequality which interests us. Let L =
(DF)(X), so that L is an invertible linear transformation on $R^n$, i.e.,
$\det F'(X) \ne 0$. To get an idea of how the proof of the lemma will proceed,
let's observe that L has the property asserted in the lemma. We know that

$|L(B) - L(A)| \le \|L\|\,|B - A|$

for all A, B in $R^n$. The inverse transformation $L^{-1}$ is also linear; hence,

$|B - A| = |L^{-1}(L(B)) - L^{-1}(L(A))|$
$\quad = |L^{-1}(L(B) - L(A))|$
$\quad \le \|L^{-1}\|\,|L(B) - L(A)|.$

Thus,

(8.40) $\|L^{-1}\|^{-1}|B - A| \le |L(B) - L(A)| \le \|L\|\,|B - A|$

for all A, B in $R^n$.

Let $\epsilon = \frac{1}{2}\|L^{-1}\|^{-1}$. We know that $F = (f_1, \ldots, f_n)$ where each $f_i$ is a
function of class $C^1$ on U, that is, $f_i'$ is a continuous function on U. Choose
a convex neighborhood W of X so that

(8.41) $|f_i'(P) - f_i'(X)| < \frac{\epsilon}{n}, \quad i = 1, \ldots, n, \quad P \in W.$

Let A, B ∈ W. Apply the mean value theorem to each $f_i$:

(8.42) $f_i(B) - f_i(A) = \langle B - A, f_i'(P) \rangle$

where P is a point on the line segment from A to B. Combine (8.41) and
(8.42) and one has

(8.43) $|f_i(B) - f_i(A) - \langle B - A, f_i'(X) \rangle| \le \frac{\epsilon}{n}\,|B - A|.$

The number

$f_i(B) - f_i(A) - \langle B - A, f_i'(X) \rangle$

is the ith coordinate of the vector

$F(B) - F(A) - L(B - A).$

The norm of a vector does not exceed the sum of the absolute values of
its coordinates. Therefore, (8.43) tells us that

(8.44) $|F(B) - F(A) - L(B - A)| \le \epsilon\,|B - A|, \quad A, B \in W.$

Thus

$|F(B) - F(A)| \ge |L(B - A)| - \epsilon\,|B - A|$
$\quad \ge \|L^{-1}\|^{-1}|B - A| - \epsilon\,|B - A|$
$\quad = \frac{1}{2}\|L^{-1}\|^{-1}|B - A|, \quad A, B \in W.$

It should be clear that we can also arrange that

$|F(B) - F(A)| \le M|B - A|$

by a similar argument; however, as we said, that is not of primary concern
to us.
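The lemma can be checked numerically on a concrete map. The sketch below is only an illustration under assumptions that are not in the text: it takes the polar-coordinate map F(r, θ) = (r cos θ, r sin θ) at the point X = (1, 0), where F'(X) is the identity, so that $\|L^{-1}\| = 1$ and the proof's constant is m = 1/2. The neighborhood radius 0.1 and the sampling grid are arbitrary choices, and the helper names are hypothetical.

```python
import math
from itertools import product


def F(r, theta):
    # Polar-coordinate map, used here only as a sample C^1 map.
    return (r * math.cos(theta), r * math.sin(theta))


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def min_ratio(center=(1.0, 0.0), radius=0.1, steps=7):
    """Smallest sampled value of |F(B) - F(A)| / |B - A| for A, B near center."""
    offsets = [radius * (2 * i / (steps - 1) - 1) for i in range(steps)]
    pts = [(center[0] + dr, center[1] + dt)
           for dr, dt in product(offsets, offsets)
           if math.hypot(dr, dt) <= radius]
    best = math.inf
    for a in pts:
        for b in pts:
            if a != b:
                best = min(best, dist(F(*a), F(*b)) / dist(a, b))
    return best
```

On this sample the smallest ratio stays above 1/2, which is what the left-hand inequality of the lemma predicts for a small enough neighborhood W.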


Theorem 7 (Inverse Mapping Theorem). Let F be a continuously differentiable
map from an open subset of $R^n$ into $R^n$. Let X be a point of the
domain of F such that the derivative (DF)(X) is invertible, i.e., such that
$\det F'(X) \ne 0$. Then X has an open neighborhood W such that

(a) F is 1:1 on W;
(b) F(W) is an open set;
(c) the inverse map

$W \xleftarrow{F^{-1}} F(W)$

is continuously differentiable and its derivative is

$(DF^{-1})(F(A)) = (DF)(A)^{-1}.$


Proof. Let N be an open ball centered at X such that the closed ball
$\overline{N}$ lies in the domain of F and on which we have an inequality

(8.45) $m|B - A| \le |F(B) - F(A)|, \quad A, B \in N.$

The last lemma guarantees the existence of such a ball N and constant
m > 0. Since F is continuous, the inequality will hold on the closed ball
$\overline{N}$ as well. That shows us that F is 1:1 on $\overline{N}$ because if F(A) = F(B) with
A, B ∈ $\overline{N}$, then (8.45) implies that B = A. Therefore, we could take W =
N and satisfy condition (a) of the theorem.

We will have to shrink N a bit in order to get a neighborhood which
satisfies condition (b) as well as condition (a). First, let us shrink N a little
so that (DF)(A) is non-singular at each point of $\overline{N}$. We can do that because
F is of class $C^1$ so that det F'(A) is a continuous function (which does not
vanish at A = X). Let Y = F(X). We want to show that Y is in the interior
of F(N), that is, we want to find a neighborhood of Y which lies entirely
in the range of F. To that end, let S be the boundary of the ball N:

$N = \{A;\ |A - X| < \delta\}$
$S = \{A;\ |A - X| = \delta\}.$

The set S is compact and F is continuous; hence F(S) is a compact subset
of $R^n$. The point Y is not in F(S) because Y = F(X), F is 1:1 on $\overline{N}$ and
X ∉ S. The situation is represented by Figure 30.

FIGURE 30

Let r be the distance from Y to the compact set F(S):

$r = d(Y, F(S)) = \inf_{A \in S} |Y - F(A)|.$


We shall show that F(N) contains the open ball of radius r/2 about the
point Y. Let Z be a point of $R^n$ with

$|Z - Y| < \frac{r}{2}.$

Let's look at the distance from Z to the set $F(\overline{N})$:

$d(Z, F(\overline{N})) = \inf\,\{|Z - F(A)|;\ A \in \overline{N}\}.$

Since |Z - Y| < r/2, we know two things:

(8.46) $d(Z, F(\overline{N})) < \frac{r}{2}, \qquad d(Z, F(S)) > \frac{r}{2}.$

The function $g(A) = |Z - F(A)|^2$ is continuous on $\overline{N}$ and so there exists
a point B ∈ $\overline{N}$ such that

$|Z - F(B)|^2 = d(Z, F(\overline{N}))^2.$

Since F is 1:1 on $\overline{N}$, (8.46) tells us that B ∉ S, that is, B is in the open ball
N. Thus, the function

$g(A) = \sum_i (z_i - f_i(A))^2$

has a minimum on N at the point B. That implies that the partial derivatives
of g all vanish at B:

(8.47) $0 = \frac{\partial g}{\partial x_j}(B) = -2 \sum_i (z_i - f_i(B))\, \frac{\partial f_i}{\partial x_j}(B).$

The sum on the right-hand side of (8.47) is the jth coordinate of the vector
(matrix)

$[Z - F(B)]\,F'(B).$

We chose N so that F'(A) was invertible for every A in N. Thus the simultaneous
equations (8.47) imply that Z - F(B) = 0. We have proved that
Z ∈ F(N).

Let W = {A ∈ N; |Y - F(A)| < r/2}. Then W is an open neighborhood
of X which satisfies conditions (a) and (b) in the conclusion of the
theorem. Condition (c) can be verified for our set W as follows. Since F
is 1:1 on W and F(W) is open, it makes sense to ask whether $F^{-1}$

$W \xleftarrow{F^{-1}} F(W)$

is differentiable. We do know right away that $F^{-1}$ is continuous, since F
is continuous and 1:1 on $\overline{W}$ and $\overline{W}$ is compact. The continuity of $F^{-1}$ is
also apparent from the inequalities in the last lemma. Let's show that
$F^{-1}$ is differentiable at the point Y = F(X) and that

$(DF^{-1})(Y) = L^{-1}$

where L = (DF)(X). The proof for any other point of W is the same.
For points Z in F(W), define

(8.48) $R(Z) = F^{-1}(Z) - F^{-1}(Y) - L^{-1}(Z - Y).$

We must show that

(8.49) $\lim_{Z \to Y} \frac{|R(Z)|}{|Z - Y|} = 0.$

Apply the linear transformation L to both sides of (8.48). We obtain

(8.50) $L(R(Z)) = L(A - X) - [F(A) - F(X)]$

where A is the unique point in W such that F(A) = Z. Since F is differentiable
at X and (DF)(X) = L, (8.50) guarantees that

(8.51) $\lim_{A \to X} \frac{|L(R(Z))|}{|A - X|} = 0.$

Since

$|L(R(Z))| \ge \|L^{-1}\|^{-1}\,|R(Z)|$
$|A - X| \le 2\,\|L^{-1}\|\,|F(A) - F(X)|$

and since A → X as Z → Y (by the continuity of $F^{-1}$), (8.51) implies (8.49).
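The derivative formula in part (c) of Theorem 7 lends itself to a numerical check. The following is a minimal sketch, assuming the sample map F(x, y) = (e^x cos y, e^x sin y) and computing the local inverse by Newton's method; neither the choice of map nor the use of Newton's method comes from the proof above, and all function names are hypothetical.

```python
import math


def F(x, y):
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))


def jac(x, y):
    # Jacobian matrix F'(x, y), written out by hand.
    e = math.exp(x)
    return ((e * math.cos(y), -e * math.sin(y)),
            (e * math.sin(y), e * math.cos(y)))


def inv2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def local_inverse(u, v, guess, iters=50):
    """Solve F(x, y) = (u, v) by Newton's method, starting from guess."""
    x, y = guess
    for _ in range(iters):
        fx, fy = F(x, y)
        (p, q), (r, s) = inv2(jac(x, y))
        x, y = (x - (p * (fx - u) + q * (fy - v)),
                y - (r * (fx - u) + s * (fy - v)))
    return (x, y)


def inverse_jacobian_fd(a, h=1e-6):
    """Central-difference approximation to (DF^{-1})(F(a))."""
    u, v = F(*a)
    cols = []
    for du, dv in ((h, 0.0), (0.0, h)):
        xp = local_inverse(u + du, v + dv, a)
        xm = local_inverse(u - du, v - dv, a)
        cols.append(((xp[0] - xm[0]) / (2 * h), (xp[1] - xm[1]) / (2 * h)))
    # cols[j] holds the derivatives with respect to the j-th target coordinate.
    return ((cols[0][0], cols[1][0]), (cols[0][1], cols[1][1]))
```

Comparing `inverse_jacobian_fd(A)` with `inv2(jac(*A))` at a point such as A = (0.3, 0.4) shows the two matrices agreeing up to the finite-difference error.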


Definition. Let U be an open subset of $R^n$ and let F be a differentiable
map from U into $R^n$. The Jacobian of F is the function det F', from U into
R.


Theorem 8 (Inverse Mapping Theorem). Let U be an open subset of $R^n$
and let F be a continuously differentiable map from U into $R^n$ such that the
Jacobian of F is nowhere 0 on U. Then F is an open mapping and F is locally
1:1. If F is 1:1 on U, then the inverse map

$U \xleftarrow{F^{-1}} F(U)$

is continuously differentiable and $(F^{-1})'(F(X)) = [F'(X)]^{-1}$.

Proof. Let V be an open subset of U. Is F(V) an open set? Yes, because
(according to Theorem 7) each point X ∈ V has a neighborhood W which
F maps onto an open set, and we may take W small enough so that W ⊂ V.
The neighborhood W also can be chosen so that F is 1:1 on W; that is
what we mean by saying that F is locally 1:1. The statement about $F^{-1}$
is immediate from Theorem 7.


EXAMPLE 5. Polar coordinates in the plane arise from a differentiable
map. Let

$U = \{(r, \theta) \in R^2;\ r > 0\}$

and define

$U \xrightarrow{F} R^2$

by

$F(r, \theta) = (x, y)$

where

$x = r \cos\theta$
$y = r \sin\theta.$

Obviously F is a map of class $C^\infty$. The Jacobian matrix of F is

$F'(r, \theta) = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}.$

The Jacobian of F is

$\det F'(r, \theta) = r(\cos^2\theta + \sin^2\theta) = r.$

Thus, det F' is nowhere 0 on U so that F is locally 1:1 and an open mapping.
The range F(U) is the punctured plane $R^2 - \{0\}$.
Notice that F is 1:1 on any slice

$a < \theta < b$

for which $b - a < 2\pi$. The image under F of that slice is an open sector
in the plane. (See Figure 31.) On such a sector, we may use r and θ as "coordinates,"
meaning that for each point (x, y) in the sector there is one
and only one number pair (r, θ) such that r > 0, a < θ < b, x = r cos θ,
y = r sin θ. On that sector, r and θ may be expressed as "smooth" functions
of x and y.

FIGURE 31

Without the explicit formulas

$r = (x^2 + y^2)^{1/2}$
$\theta = \arctan\frac{y}{x}$ (suitably interpreted)

we can see something about derivatives. We know that

$(DF^{-1})(x, y) = [(DF)(r, \theta)]^{-1}$

and thus

$(DF^{-1})(x, y) = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}^{-1} = \frac{1}{r} \begin{bmatrix} r\cos\theta & r\sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}.$

Therefore,

$\frac{\partial r}{\partial x} = \cos\theta, \qquad \frac{\partial r}{\partial y} = \sin\theta$
$\frac{\partial \theta}{\partial x} = -\frac{1}{r}\sin\theta, \qquad \frac{\partial \theta}{\partial y} = \frac{1}{r}\cos\theta.$

Are you sure you understand what these equations mean?
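One way to test one's understanding of these equations is to compare them with difference quotients of the explicit inverse. The sketch below assumes the branch of θ given by `math.atan2` (valid away from the negative x-axis); the helper names and the sample point are illustrative choices, not part of the text.

```python
import math


def polar(x, y):
    # Explicit inverse of F on a sector avoiding the negative x-axis.
    return (math.hypot(x, y), math.atan2(y, x))


def fd_partials(x, y, h=1e-6):
    """Central-difference approximations to the partials of r and theta."""
    r_x = (polar(x + h, y)[0] - polar(x - h, y)[0]) / (2 * h)
    r_y = (polar(x, y + h)[0] - polar(x, y - h)[0]) / (2 * h)
    t_x = (polar(x + h, y)[1] - polar(x - h, y)[1]) / (2 * h)
    t_y = (polar(x, y + h)[1] - polar(x, y - h)[1]) / (2 * h)
    return ((r_x, r_y), (t_x, t_y))


def formula_partials(x, y):
    """The same partials read off from the inverse Jacobian matrix."""
    r, theta = polar(x, y)
    return ((math.cos(theta), math.sin(theta)),
            (-math.sin(theta) / r, math.cos(theta) / r))
```

At a point such as (x, y) = (1, 2) the two results agree to within the finite-difference error, with each partial derivative taken while the other Cartesian coordinate is held fixed.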


EXAMPLE 6. The spherical coordinates for a point (x, y, z) in $R^3$
are the numbers r, θ, φ defined by

$x = r \cos\theta \sin\phi$
$y = r \sin\theta \sin\phi$
$z = r \cos\phi.$

If (x, y, z) ≠ (0, 0, 0), there is a unique such triple with r > 0, 0 ≤ θ <
2π, 0 ≤ φ ≤ π (Figure 32). The map F(r, θ, φ) = (x, y, z) behaves a
great deal like the polar coordinate map in the last example. The Jacobian
of F is $-r^2 \sin\phi$, which underscores the fact that spherical coordinates
do not serve well along the z-axis where φ = 0.
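The value $-r^2 \sin\phi$ for the Jacobian can be confirmed numerically. The sketch below assumes the variable order (r, θ, φ) used in the example and approximates the Jacobian matrix by central differences; the function names and the step size are illustrative assumptions.

```python
import math


def spherical(r, theta, phi):
    return (r * math.cos(theta) * math.sin(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(phi))


def jacobian_fd(point, h=1e-6):
    """3x3 Jacobian matrix of the spherical map by central differences."""
    cols = []
    for j in range(3):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        fp, fm = spherical(*plus), spherical(*minus)
        cols.append([(fp[i] - fm[i]) / (2 * h) for i in range(3)])
    # Row i, column j is the partial of the i-th output in the j-th variable.
    return [[cols[j][i] for j in range(3)] for i in range(3)]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
```

For instance, `det3(jacobian_fd((2.0, 0.7, 1.1)))` is close to `-4 * math.sin(1.1)`, and the determinant tends to 0 as φ approaches 0.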


FIGURE 32


EXAMPLE 7. Suppose that we have any continuously differentiable
map

$U \xrightarrow{F} R^n$

which is 1:1 and has a non-vanishing Jacobian:

$\det F'(X) \ne 0, \quad X \in U.$

Then F defines a new "coordinate system" on U. With each point X =
$(x_1, \ldots, x_n)$ in U is associated a unique n-tuple $Y = F(X) = (y_1, \ldots, y_n)$,
where $y_i = f_i(x_1, \ldots, x_n)$. In other words, if it is convenient we can
describe each point in U by giving its y coordinates. As X ranges over U, the
possible n-tuples of y coordinates $(y_1, \ldots, y_n)$ range over the open set
V = F(U). The inverse map

$U \xleftarrow{F^{-1}} V$

simply expresses the x coordinates as functions of the y coordinates: $x_i =
g_i(y_1, \ldots, y_n)$. As in the previous example, it is quite common to introduce
new coordinates on U by starting with the map $F^{-1}$, i.e., by starting with
$x_i = g_i(y_1, \ldots, y_n)$.

The uses of different coordinate systems are not restricted to special
ones of the type given in the two preceding examples. The proofs of certain
general theorems are facilitated by the use of coordinate systems suited
to the context. Here is a very simple illustration.

Suppose that g is a function of class $C^1$ on a neighborhood of the
origin in $R^3$ with g(0) = 0. Roughly speaking, the zero set of g should look
like a surface which passes through the origin. That is a valid idea if we
do not allow certain degenerate cases. Suppose that (Dg)(0) ≠ 0, that is,
suppose that the gradient of g does not vanish at the origin. Then, one can
show that the set S = {X; g(X) = 0} is a genuine surface near the origin.
There are other functions f which define the same surface. For example,
let h be any function of class $C^1$ such that h(0) ≠ 0. Let f = gh. Then, in
a small neighborhood of the origin, f and g have the same zero set. Now
ask this. Suppose we are given f, of class $C^1$ on a neighborhood of the
origin, and suppose that f vanishes on the part of S which lies in that
neighborhood. Is f divisible by g; that is, can we write f = gh where h is a
smooth function near the origin?

What does the inverse mapping theorem tell us about that question?
It tells us that it suffices to worry about it in the case in which g is one of
the coordinate functions $x_i$. Why? Since (Dg)(0) ≠ 0, one of the partial
derivatives of g is different from 0 at the origin. For convenience, suppose
that $(D_1 g)(0) \ne 0$. Then we may use g, $x_2$, and $x_3$ as coordinates for $R^3$
near the origin. The map

$F(X) = (g(X), x_2, x_3)$

is of class $C^1$ and has Jacobian matrix

$F'(X) = \begin{bmatrix} \frac{\partial g}{\partial x_1} & \frac{\partial g}{\partial x_2} & \frac{\partial g}{\partial x_3} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$

Thus $\det F' = D_1 g$ and so $\det F'(0) \ne 0$. Choose U, a neighborhood of
the origin which F maps 1:1 onto an open neighborhood V of the origin.
If f is of class $C^1$ on U and f vanishes on S ∩ U = {X ∈ U; g(X) = 0},
then $f \circ F^{-1}$ is of class $C^1$ on V and vanishes on $F(S \cap U) = \{Y \in V;\ y_1 = 0\}$.
It should be clear that proving that g divides f is the same as proving that
$y_1$ divides $f \circ F^{-1}$. In short, we need consider only the problem: f is of class
$C^1$ near the origin and f vanishes on the plane $x_1 = 0$; show that $f = x_1 h$
where h is (at least) continuous. That is easy to show.


In the last example, we mentioned a surface S in $R^3$ which was defined
by an equation $g(x_1, x_2, x_3) = 0$. Near the origin where $(D_1 g)(0) \ne 0$,
we were able to use g, $x_2$, and $x_3$ as coordinates. In particular, we could
express $x_1$ as a function of g, $x_2$ and $x_3$ near the origin: $x_1 = p(y_1, x_2, x_3)
= p(g(x_1, x_2, x_3), x_2, x_3)$. The part of the surface S which passes through
the origin can then be described by $x_1 = p(0, x_2, x_3)$. In short, the relation
$g(x_1, x_2, x_3) = 0$ implicitly defines $x_1$ as a function of $x_2$ and $x_3$; and that
dependence can be made explicit near a point where $(D_1 g)(X) \ne 0$. This
is a special case of the following basic result.


Theorem 9 (Implicit Function Theorem). Let W be an open subset of
$R^{k+m}$, let F be a continuously differentiable map from W into $R^m$ and let
(A; B) be a point of W such that F(A; B) = 0. Suppose that the m × m
matrix

(8.52) $M = [(D_{k+j} f_i)(A; B)]$

is non-singular. There exists an open neighborhood U of the point A and
an open neighborhood V of B such that if X ∈ U, there exists a unique
point G(X) ∈ V such that F(X; G(X)) = 0. The map G which is thereby
defined is continuously differentiable.


Proof. Define a map

$W \xrightarrow{\tilde F} R^{k+m}$

by $\tilde F(X; Y) = (X; F(X; Y))$. Clearly $\tilde F$ is continuously differentiable
on W. The Jacobian matrix of $\tilde F$ has the block form

$\tilde F' = \begin{bmatrix} I & 0 \\ [D_j f_i] & [D_{k+j} f_i] \end{bmatrix}.$

In particular,

$\det \tilde F'(A; B) = \det M$

where M is the matrix of (8.52). By hypothesis, det M ≠ 0. We apply the
inverse mapping theorem to the map $\tilde F$ at the point (A; B). We obtain an
open neighborhood of (A; B) on which $\tilde F$ is 1:1 with continuously differentiable
inverse. We may assume that the neighborhood is of the form
U × V, where U is a neighborhood of A and V is a neighborhood of B.

Let's look at the inverse map for $\tilde F$ which is defined on $\tilde F(U \times V)$,
an open neighborhood of $\tilde F(A; B) = (A; 0)$. That map expresses (X; Y)
as a function of (X; F(X; Y)). Let X ∈ U. The point $\tilde F^{-1}(X; 0)$ has the
form (X; Y) for some Y ∈ V. Define G(X) = Y.


The implicit function theorem can be formulated as a result about
scalar-valued functions, and it is worthwhile to see what the result says
in that form. We are given a point $A = (a_1, \ldots, a_k)$ in $R^k$ and a point
$B = (b_1, \ldots, b_m)$ in $R^m$. Suppose that we have m functions $f_1, \ldots, f_m$
of class $C^1$ on a neighborhood of the point (A; B) in $R^{k+m}$. We are interested
in the set of points (X; Y) in $R^{k+m}$ which satisfy simultaneously the
m relations

$f_1(x_1, \ldots, x_k, y_1, \ldots, y_m) = 0$
(8.53) $\quad \vdots$
$f_m(x_1, \ldots, x_k, y_1, \ldots, y_m) = 0.$

Suppose that (A; B) is such a point. The implicit function theorem states
the following. If

$\left| \frac{\partial(f_1, \ldots, f_m)}{\partial(y_1, \ldots, y_m)} \right| (A; B) \ne 0$

(vertical bars denote determinant), then there is an open set U about A
and an open set V about B such that, for each X ∈ U there is precisely
one point Y ∈ V for which the m relations (8.53) hold, and Y depends
upon X in a continuously differentiable way. In other words, there are
continuously differentiable functions $g_1, \ldots, g_m$ on U such that on the


Sec. 8.5 Inverse Mappings 


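set U × V the equations (8.53) are equivalent to

$y_1 = g_1(x_1, \ldots, x_k)$
(8.54) $\quad \vdots$
$y_m = g_m(x_1, \ldots, x_k).$

From the relations

(8.55) $f_i(x_1, \ldots, x_k, g_1(X), \ldots, g_m(X)) = 0, \quad 1 \le i \le m$

which hold identically on U we can compute the partial derivatives of
$g_1, \ldots, g_m$. Apply the chain rule, to differentiate with respect to $x_j$:

(8.56) $\frac{\partial f_i}{\partial x_j}(X; G(X)) + \sum_{l=1}^{m} \frac{\partial f_i}{\partial y_l}(X; G(X))\, \frac{\partial g_l}{\partial x_j}(X) = 0, \quad 1 \le i \le m.$

This is a system of linear equations for the unknowns

$\frac{\partial g_1}{\partial x_j}, \ldots, \frac{\partial g_m}{\partial x_j}.$

The solution can be obtained from Cramer's rule because the coefficient
matrix

$\begin{bmatrix} \frac{\partial f_1}{\partial y_1} & \cdots & \frac{\partial f_1}{\partial y_m} \\ \vdots & & \vdots \\ \frac{\partial f_m}{\partial y_1} & \cdots & \frac{\partial f_m}{\partial y_m} \end{bmatrix}$

is non-singular at (X; G(X)); its determinant is

$\frac{\partial(f_1, \ldots, f_m)}{\partial(y_1, \ldots, y_m)}(X; G(X)).$

The solution by Cramer's rule is

$\frac{\partial g_l}{\partial x_j} = -\,\frac{\partial(f_1, \ldots, f_m)/\partial(y_1, \ldots, x_j, \ldots, y_m)}{\partial(f_1, \ldots, f_m)/\partial(y_1, \ldots, y_m)}.$

The Jacobians on the right are evaluated at (X; G(X)), and $\partial(y_1, \ldots, x_j,
\ldots, y_m)$ means replace $y_l$ by $x_j$.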
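A small worked instance may make the Cramer's rule recipe concrete. The sketch below assumes a made-up system with k = 1 and m = 2, namely $f_1 = y_1^2 - x$ and $f_2 = y_1 y_2 - 1$, whose solution near x = 4 is $y_1 = \sqrt{x}$, $y_2 = 1/\sqrt{x}$. The system and all names are illustrative assumptions, not taken from the text.

```python
import math


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def implicit_partials(x, y1, y2):
    """Partials (dg_1/dx, dg_2/dx) by Cramer's rule for the sample system."""
    dfdx = (-1.0, 0.0)                    # (df_1/dx, df_2/dx)
    dfdy = ((2 * y1, 0.0), (y2, y1))      # rows: (df_i/dy_1, df_i/dy_2)
    d = det2(dfdy)
    # Replace the y_1 column, then the y_2 column, by the x column.
    num1 = ((dfdx[0], dfdy[0][1]), (dfdx[1], dfdy[1][1]))
    num2 = ((dfdy[0][0], dfdx[0]), (dfdy[1][0], dfdx[1]))
    return (-det2(num1) / d, -det2(num2) / d)


def explicit_partials(x):
    """Derivatives of the explicit solution y_1 = sqrt(x), y_2 = 1/sqrt(x)."""
    return (0.5 / math.sqrt(x), -0.5 * x ** -1.5)
```

At x = 4 (so $y_1 = 2$, $y_2 = 1/2$) both functions give the pair (0.25, -0.0625).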
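Of course, all that the last equations say is this. If $F = (f_1, \ldots, f_m)$
and $G = (g_1, \ldots, g_m)$, then

$F(X; G(X)) = 0.$

This says that $F \circ \tilde G = 0$, where

$U \xrightarrow{\tilde G} U \times V$
$\tilde G(X) = (X; G(X)).$

Thus

$F'(X; G(X))\,\tilde G'(X) = 0.$

Now the (k + m) × k Jacobian matrix $\tilde G'(X)$ has the block form

$\tilde G'(X) = \begin{bmatrix} I \\ G'(X) \end{bmatrix}$

where I is the k × k identity matrix. If we employ the shorthand

$\frac{dF}{dX} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_k} \\ \vdots & & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_k} \end{bmatrix}, \qquad
\frac{dF}{dY} = \begin{bmatrix} \frac{\partial f_1}{\partial y_1} & \cdots & \frac{\partial f_1}{\partial y_m} \\ \vdots & & \vdots \\ \frac{\partial f_m}{\partial y_1} & \cdots & \frac{\partial f_m}{\partial y_m} \end{bmatrix}$

then the equation $F'(X; G(X))\,\tilde G'(X) = 0$ becomes

$\frac{dF}{dX}(X; G(X)) + \frac{dF}{dY}(X; G(X))\,G'(X) = 0.$

Thus

$G'(X) = -\left(\frac{dF}{dY}\right)^{-1} \frac{dF}{dX}.$

Sloppily speaking, along the set where F(X, Y) = 0, one has

$\frac{dY}{dX} = -\,\frac{dF/dX}{dF/dY}$ (remember this is nonsense).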

The situations in which one uses the implicit function theorem do not
usually come packaged quite as neatly as what we have been describing.
So, let us state the theorem another way. Suppose that P is a point in $R^n$
and that F is a continuously differentiable map from a neighborhood of
P into $R^m$, where m ≤ n. The Jacobian matrix F'(P) is an m × n matrix.
Suppose that the rank of F'(P) is as large as it can be: rank F'(P) = m.
Then there are some m indices $i_1 < \cdots < i_m$ such that

$\frac{\partial(f_1, \ldots, f_m)}{\partial(x_{i_1}, \ldots, x_{i_m})}(P) \ne 0.$

The implicit function theorem says the following. Look at S, the level set
of F through P

S = {X; F(X) = F(P)}.

Then that part of S which is near P can be described by expressing the
coordinates $x_{i_1}, \ldots, x_{i_m}$ as certain continuously differentiable functions
of the remaining n - m coordinates.


Exercises 


1. Let $f(x) = x^2$, $g(x) = x^3$. Answer the following for each function:

(a) At which points x is the derivative invertible?

(b) Is the function 1:1 on a neighborhood of 0?

(c) Is the image an open subset of $R^1$?

(d) What can you say about the differentiability of the inverse function
(wherever it is defined)?


2. Let $f(x) = x^4 \sin(1/x)$, x ≠ 0, and f(0) = 0.

(a) Show that f is of class $C^1$ on the real line.

(b) Show that the range of f is open but f is not 1:1 on any neighborhood
of 0.


3. Prove that, if f is a real-valued function of class $C^1$ on the real line and f
is locally 1:1, then the image f(R) is open and f is 1:1.


4. Let f be a function of class $C^1$ on a neighborhood of the origin in $R^n$ and
suppose that f(X) = 0 on the set $\{X;\ x_1 = 0\}$. Prove that $f(X) = x_1 g(X)$,
where g is continuous on a neighborhood of the origin.


5. Under the hypotheses of the inverse mapping theorem, prove that if F is of
class $C^k$ then $F^{-1}$ is of class $C^k$.


6. Let F be the map from $R^2$ into $R^2$:

$F(x, y) = (x, x^2 + y^2).$

(a) What is the image of F?

(b) What is the matrix F'(x, y)?

(c) At which points (x, y) is F locally 1:1?


7. Let F be the map from $R^2$ into $R^2$:

$F(x, y) = (e^x \cos y, e^x \sin y).$

(a) Show that (DF)(x, y) is invertible at every point (x, y).

(b) Show that F is an open mapping, i.e., that F(U) is open for every open
set U.

(c) Show that F is not a 1:1 map.

(d) What is the image $F(R^2)$? Is there an open set U which F maps 1:1
onto $F(R^2)$?


8. Refer to Exercise 7 of Section 8.4. Use the inverse mapping theorem to
prove that, if

$U \xrightarrow{f} C$

is a complex-analytic function, then $f^{-1}$ (where defined) is a complex-analytic
function.


9. Consider the real-valued function

$f(x, y) = x^2 + y^2 - 5$

on $R^2$. What does the implicit function theorem say in a neighborhood of (2, 1)?
In a neighborhood of $(\sqrt{5}, 0)$?


10. Suppose everything is of class $C^1$ and the equations

f(x, y, z) = 0 and g(x, y, z) = 0

can be solved for y and z as functions of x. What is dy/dx?


11. Assume the implicit function theorem and use it to prove the inverse mapping
theorem. (Hint: Consider G(X, Y) = X - F(Y).)

8.6. Change of Variable 


We now turn to "change of variable" theorems, sometimes called
substitution theorems in calculus. The 1-dimensional theorem says this.
Suppose h is an integrable function on the interval [a, b] and we wish to
compute

$\int_a^b h(x)\,dx.$

Suppose x = g(t), in this sense: We are given a smooth function g such
that g maps [c, d] onto [a, b] with g' > 0. Then

$\int_a^b h(x)\,dx = \int_c^d h(g(t))\,g'(t)\,dt.$
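The 1-dimensional formula is easy to test numerically. The sketch below assumes the sample choices $h(x) = x^2$ and $g(t) = e^t$ on [c, d] = [0, 1], so that [a, b] = [1, e], and it approximates both integrals with a midpoint rule; these choices and the helper names are illustrative assumptions.

```python
import math


def midpoint_rule(func, lo, hi, n=20000):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    width = (hi - lo) / n
    return width * sum(func(lo + (i + 0.5) * width) for i in range(n))


def h(x):
    return x * x


def g(t):
    return math.exp(t)


def g_prime(t):
    return math.exp(t)


def both_sides():
    """Return (integral of h over [a, b], integral of h(g(t)) g'(t) over [c, d])."""
    left = midpoint_rule(h, g(0.0), g(1.0))
    right = midpoint_rule(lambda t: h(g(t)) * g_prime(t), 0.0, 1.0)
    return left, right
```

Both values returned by `both_sides()` approximate $(e^3 - 1)/3$, the exact value of either integral.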
In higher dimensions the substitution will be given by a map G:

$R^n \xrightarrow{G} R^n$
$x_1 = g_1(t_1, \ldots, t_n)$
$\quad \vdots$
$x_n = g_n(t_1, \ldots, t_n).$

To be precise, suppose we have

$U \xrightarrow{G} V$

where

(i) U, V are open sets in $R^n$;

(ii) G is of class $C^1$;

(iii) G is 1:1 from U onto V;

(iv) $\det G'(X) \ne 0$ throughout U.

What we shall prove is this. If $h \in L^1(V)$, then $(h \circ G)\,|\det G'| \in L^1(U)$ and

$\int_V h(Y)\,dY = \int_U h(G(X))\,|\det G'(X)|\,dX.$
In case h is the characteristic function of a measurable set E, the result
will say

$m(E) = \int_{G^{-1}(E)} |\det G'(X)|\,dX.$

Or, one can read it this way. If A is a measurable subset of U, then

$m(G(A)) = \int_A |\det G'(X)|\,dX.$


Lemma 1. If E is a box in V, then

(8.57) $m(E) = \int_{G^{-1}(E)} |\det G'(X)|\,dX.$

Proof. Later.

Lemma 2. Suppose Lemma 1 is valid for a particular map G. Then $G^{-1}$
maps sets of measure 0 into sets of measure 0.

Proof. Exercise.

Lemma 3. Suppose Lemma 1 is valid for a particular map G. Then, for
each $h \in L^1(V)$, the function $(h \circ G)\,|\det G'|$ is in $L^1(U)$ and

(8.58) $\int_V h(Y)\,dY = \int_U (h \circ G)(X)\,|\det G'(X)|\,dX.$


Proof. From Lemma 1 it is clear that (8.58) holds for step functions,
that is, finite linear combinations of characteristic functions of boxes. Given
$h \in L^1(V)$ we can find a sequence $\{h_k\}$ such that

(i) each $h_k$ is a step function;
(ii) $\int_V |h - h_k| \to 0$;
(iii) $\lim_{k \to \infty} h_k(Y) = h(Y)$, a.e.

From (ii)

$\lim_{j,k \to \infty} \int_V |h_j - h_k| = 0.$

Now $|h_j - h_k|$ is also a step function. Apply (8.58) to those functions. We
get

$\lim_{j,k \to \infty} \int_U |h_j(G(X)) - h_k(G(X))|\,|\det G'(X)|\,dX = 0.$

If we let

$u_k(X) = h_k(G(X))\,|\det G'(X)|$

then $u_k \in L^1(U)$ and

$\lim_{j,k \to \infty} \int_U |u_j(X) - u_k(X)|\,dX = 0.$

Also

$\lim_{k \to \infty} u_k(X) = h(G(X))\,|\det G'(X)|$, a.e.

because of Lemma 2. Thus, $(h \circ G)\,|\det G'|$ is in $L^1(U)$ and

$\int_U h(G(X))\,|\det G'(X)|\,dX = \lim_k \int_U u_k(X)\,dX$
$\quad = \lim_k \int_V h_k(Y)\,dY$
$\quad = \int_V h(Y)\,dY.$
We are trying to prove this theorem.


Theorem 10. Let U, V be open sets in $R^n$. Suppose G is a map

$U \xrightarrow{G} V$

such that

(i) G is of class $C^1$;

(ii) G is 1:1;

(iii) $\det G'(X) \ne 0$, X ∈ U.

If $h \in L^1(V)$, then $(h \circ G) \cdot |\det G'|$ is in $L^1(U)$ and

$\int_V h(Y)\,dY = \int_U h(G(X))\,|\det G'(X)|\,dX.$


We have noted several things about the proof.

(a) It is true when n = 1.
(b) For any given G, the theorem follows once we can show that

(8.59) $m(E) = \int_{G^{-1}(E)} |\det G'|\,dX$

for every box E ⊂ V.
We also wish to observe that:
(c) It suffices to verify (8.59) for boxes E and maps G such that

$\det G' > 0$ on $G^{-1}(E)$
$D_1 g_1$ has constant sign on $G^{-1}(E)$.

To see this, subdivide the open set V into boxes, using gridworks.
Make sure that each box E is small enough that det G' has constant sign
on $G^{-1}(E)$. One of the derivatives $D_j g_1$ must be different from zero at any
point where $\det G' \ne 0$; hence we can choose the boxes E small enough
that det G' has constant sign on $G^{-1}(E)$ and that some one of the partial
derivatives $D_j g_1$ also has constant sign on that set. By reordering (some of
the) coordinates, we may arrange that (c) is satisfied; hence it is the only
case we need to consider.

Now, we proceed to prove the last assertion when n > 1, by induction.
We have the theorem for n = 1. We shall now show that, if the theorem
is true for n - 1, then Lemma 1 is true for n. So take a box E and a map
G, as in (c).
We define a map F

$G^{-1}(E) \xrightarrow{F} R^n$

by

$F(x_1, \ldots, x_n) = (g_1(x_1, \ldots, x_n), x_2, \ldots, x_n).$

(See Figure 33.) We shall apply the theorem in dimension n - 1,

once to the map $G \circ F^{-1}$

once to the map $F^{-1}$.

Of course, those are maps from $R^n$ to $R^n$; however, each is independent of
one of the variables.
Let's look at $F^{-1}$:

$S \xrightarrow{F^{-1}} G^{-1}(E)$

where $S = F(G^{-1}(E))$. Why is F a 1:1 map, and how is $F^{-1}$ defined? Let's see. If z ∈ S, then

$F^{-1}(z_1, \ldots, z_n) = (x_1, z_2, \ldots, z_n)$

where

$z_1 = g_1(x_1, z_2, \ldots, z_n)$

and that makes sense, because, if we fix $z_2, \ldots, z_n$, the condition "$D_1 g_1$
of constant sign" tells us that $g_1$ is 1:1 on the segment $\{x;\ x_j = z_j,
\ j = 2, \ldots, n\}$.

FIGURE 33
Now F is independent of $x_2, \ldots, x_n$; so the 1-dimensional theorem
tells us something about F, or $F^{-1}$. It's $F^{-1}$ that interests us. Suppose we
fix $z_2, \ldots, z_n$. Look at the 1-dimensional map g:

$g(z_1)$ = the first coordinate of $F^{-1}(z_1, \ldots, z_n)$
$\quad = x_1$, where $g_1(x_1, z_2, \ldots, z_n) = z_1$.

We know that

$\int h(x)\,dx = \int h(g(t))\,|g'(t)|\,dt.$

Now

$(F^{-1})'(z) = \begin{bmatrix} D_1(F^{-1})_1 & D_2(F^{-1})_1 & \cdots & D_n(F^{-1})_1 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$

so that

$\det (F^{-1})'(z) = g'(z_1).$
Enough of that, for the moment. Let's look at the map $G \circ F^{-1}$:

$(G \circ F^{-1})(z_1, \ldots, z_n) = (y_1, \ldots, y_n)$

where

$y_1 = z_1$
$y_j = g_j(x_1, z_2, \ldots, z_n), \quad j \ge 2$

and $x_1$ is determined by

$g_1(x_1, z_2, \ldots, z_n) = z_1.$

So $G \circ F^{-1}$ leaves the first coordinate fixed. That is the point of defining
F, namely, that $G \circ F^{-1}$ is like a map on $R^{n-1}$.
Specifically, suppose we fix $z_1$. The set

$S_{z_1} = \{z \in S;\ \text{the first coordinate of } z \text{ is } z_1\}$

is mapped by $G \circ F^{-1}$ onto the set

$E_{z_1} = \{y \in E;\ y_1 = z_1\}.$

That means that, by the (n - 1)-dimensional theorem, we can compute

$m(E_{z_1}) = \int_{S_{z_1}} |\det H_{z_1}'(z_2, \ldots, z_n)|\,dz_2 \cdots dz_n$

where $H_{z_1}$ is the map

$S_{z_1} \xrightarrow{H_{z_1}} E_{z_1}$
$H_{z_1}(z_2, \ldots, z_n) = (h_2(z_1, \ldots, z_n), \ldots, h_n(z_1, \ldots, z_n))$

that is, $H = G \circ F^{-1}$. But, how do we compute $H_{z_1}'$? That's easy.
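$H = (h_1, \ldots, h_n)$
$h_1(z_1, \ldots, z_n) = z_1$

$H' = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ * & D_2 h_2 & \cdots & D_n h_2 \\ \vdots & \vdots & & \vdots \\ * & D_2 h_n & \cdots & D_n h_n \end{bmatrix} = \begin{bmatrix} 1 & 0 \ \cdots \ 0 \\ * & H_{z_1}' \end{bmatrix}$

(the entries marked * do not affect the determinant, because the first row is
(1, 0, ..., 0)). So $\det H_{z_1}'(z_2, \ldots, z_n) = \det H'(z)$. Thus,

$m(E_{z_1}) = \int_{S_{z_1}} |\det H'(z)|\,dz_2 \cdots dz_n.$

Now

$m(E) = \int m(E_{z_1})\,dz_1$
$\quad = \int dz_1 \int_{S_{z_1}} |\det H'(z)|\,dz_2 \cdots dz_n$
$\quad = \int_S |\det H'|\,dz_1 \cdots dz_n.$

Since

$H = G \circ F^{-1}$
$H'(z) = G'(F^{-1}(z))\,(F^{-1})'(z).$

Thus,

$m(E) = \int_S |\det G'(F^{-1}(z))|\,|\det (F^{-1})'(z)|\,dz.$

Apply the 1-dimensional theorem to $F^{-1}$ in a manner analogous to the
application just made to $G \circ F^{-1}$. We obtain

$m(E) = \int_{G^{-1}(E)} |\det G'(X)|\,dX.$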


Corollary. If L is a linear transformation from $R^n$ into $R^n$ then, for
every measurable set S in $R^n$, the image set L(S) is measurable and its measure
is m(L(S)) = |det L| m(S).

Proof. If L is non-singular, apply the theorem with G = L. Then
det G' has the constant value det L. If L is singular, the image $L(R^n)$ is a
proper subspace of $R^n$ and hence has measure zero. So the result is trivially
true.
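For a concrete picture of the corollary, one can take S to be the unit square in the plane and compute the area of its image directly. The sketch below assumes the shoelace formula for polygon area and two sample matrices (one non-singular, one singular); these are illustrative choices, and the function names are hypothetical.

```python
def apply(L, p):
    """Apply the 2x2 matrix L to the point p."""
    return (L[0][0] * p[0] + L[0][1] * p[1],
            L[1][0] * p[0] + L[1][1] * p[1])


def polygon_area(vertices):
    """Area of a simple polygon from its vertices in order (shoelace formula)."""
    total = 0.0
    for i, (x0, y0) in enumerate(vertices):
        x1, y1 = vertices[(i + 1) % len(vertices)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


def det2(L):
    return L[0][0] * L[1][1] - L[0][1] * L[1][0]


UNIT_SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]


def image_area(L):
    """Area of the image of the unit square under L."""
    return polygon_area([apply(L, p) for p in UNIT_SQUARE])
```

With L = ((2, 1), (1, 3)) the image parallelogram has area 5 = |det L|, and with the singular matrix ((1, 2), (2, 4)) the image collapses to a segment of area 0.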


The corollary is much more elementary than the change of variables
theorem. It can be proved directly in a variety of ways. Some proofs of 
Theorem 10 proceed by verifying the corollary first and then using it to 
(help) prove the more general result. 


EXAMPLE 7. An elementary example with which the reader is probably
familiar from calculus deals with polar coordinates. Let

$U = \{(r, \theta);\ r > 0,\ 0 < \theta < 2\pi\}$
$G(r, \theta) = (r\cos\theta, r\sin\theta).$

Then G is 1:1 and maps U onto

$V = R^2 - \{(x, 0);\ x \ge 0\}.$

We have

$G'(r, \theta) = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}$

$\det G'(r, \theta) = r.$

Since $R^2 - V$ is a set of measure zero in the plane, $L^1(V) = L^1(R^2)$.
Theorem 10 states that if f is an integrable function on $R^2$, then
$f(r\cos\theta, r\sin\theta)\,r$ is integrable over U and

$\int_{R^2} f = \int_U f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta$
$\quad = \int_0^{2\pi} \int_0^{\infty} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta.$
0 0 


(a) Let 
σι ἘΝ 21 γ 
Then fis a measurable function on R?, and since 


| f(r cos 8, r sin 8)| =< 


we see that fis integrable on each bounded subset of the plane: 


Ra ea 


== iho. = Cos 


(b) The reader should be (or become) familiar with the calculation of

$\int_{-\infty}^{\infty} e^{-x^2}\,dx.$

Obviously $f(x) = e^{-x^2}$ is integrable on $R^1$ because $e^{-x^2} \le e^{-|x|}$ for large
x. Note that (since f > 0)
Thus $\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$.
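The classical value $\sqrt{\pi}$ of this integral, and the polar-coordinate computation behind it, can be checked numerically. The sketch below assumes a midpoint rule and truncates the infinite ranges at 8, where the integrands are negligibly small; both choices, and the function names, are illustrative assumptions.

```python
import math


def midpoint_rule(func, lo, hi, n=20000):
    """Midpoint-rule approximation to the integral of func over [lo, hi]."""
    width = (hi - lo) / n
    return width * sum(func(lo + (i + 0.5) * width) for i in range(n))


def gaussian_line(limit=8.0):
    """Approximates the integral of exp(-x^2) over the real line."""
    return midpoint_rule(lambda x: math.exp(-x * x), -limit, limit)


def gaussian_plane_polar(limit=8.0):
    """Approximates the integral of exp(-(x^2 + y^2)) over the plane,
    written in polar coordinates with the factor r from det G'."""
    radial = midpoint_rule(lambda r: math.exp(-r * r) * r, 0.0, limit)
    return 2 * math.pi * radial
```

The square of `gaussian_line()` agrees with `gaussian_plane_polar()`, and both are close to π, in line with the Fubini theorem and Theorem 10.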


Exercises 


1. Evaluate


pt ἡ
dx ὦ
J, τ δα


by a suitable change of variables. Justify each step, using the Fubini theorem or
the change of variable theorem.


2. Prove Lemma 2 of this section. 


3. From spherical coordinates (r, φ, θ) in $R^3$ we obtain a formula

$m(E) = \int_E f(r, \phi, \theta)\,dr\,d\phi\,d\theta$

for the measure (volume) of a set E. Find f(r, φ, θ) explicitly.


4. What is the measure of the ball

$B(0; r) = \{X \in R^n;\ |X| < r\}$?


5. Find


1 fi pi ἢ
--ς.--------- τ τς dx dy dz.
1....ττῦττν


*6. Let G be a map of class $C^1$, from an open set U ⊂ $R^n$ into $R^n$, and let S
= {X ∈ U; det G'(X) = 0}. Prove that the (image) set G(S) is a set of measure
zero.
Elementary Set Theory


A.1. Sets and Functions


We describe some basic terminology. We shall talk frequently about 
sets. We make no attempt to define what a set is, except to say that a set is 
any collection of objects. A set will be called a collection, class, or family 
on various occasions. The objects in the set S are called the members, ele- 
ments, or points of S. If x is a member of S, we write x € S. Most often, 
a set is specified by giving a rule for membership in the set. A convenient 
piece of shorthand for presenting such a rule is the notation {x; . . .}, to be 
read “the set of all x such that... .” 

The concept of set is simple and basic. Experience indicates that large 
numbers of people comprehend readily what the abstract idea is. But, 
indiscriminate use of set terminology does have some problems and pitfalls 
which are not immediately apparent. We shall try to avoid such difficulties 
by not venturing into the mathematical contexts in which they arise. If the 
reader meets a set and feels that we have not given a well-defined rule by 
which we can determine for each x whether or not x is a member of the set, 
then he or she should feel free to inquire or complain about it. 

The set S is a subset of the set T if each member of S is a member of T.
If S is a subset of T, we write S ⊂ T. If S ⊂ T and S ≠ T, then S is called
a proper subset of T. Note that every set is a subset of itself. The empty set
is the set with no members. It will be denoted by ∅. It is a subset of every
set.

The union of the sets S and T is the set S ∪ T, which consists of the objects
which are either in S or in T:

S ∪ T = {x; x ∈ S or x ∈ T}.
The intersection of S and T is the set S ∩ T, which consists of the objects
which are both in S and in T:

S ∩ T = {x; x ∈ S and x ∈ T}.

We always have

S ∩ T ⊂ S ∪ T

i.e., "or" does not preclude the possibility of "and". The sets S, T are called
disjoint (from one another) if their intersection is the empty set:

S ∩ T = ∅.

The complement of T relative to S is

S - T = {x ∈ S; x ∉ T}.


Frequently, one works in a context in which all sets of interest are subsets 
of one “universal” set S. In such a case, the complement of T relative to S 
is referred to as (simply) the complement of T. 

The Cartesian product of S and T is the set $S \times T$, which consists of
all (ordered) pairs (s, t) with s in S and t in T:

$$S \times T = \{(s, t);\ s \in S,\ t \in T\}.$$


Probably the most important concept in mathematics is that of “function”.
A function consists of

(i) a pair of sets D, Y;

(ii) a rule f, which associates with each object x in D an object
f(x) in Y.

We usually describe the above in a slightly imprecise way by saying that f
is a function from D into Y and by writing

$$D \xrightarrow{f} Y.$$

We call D the domain (of definition) of f and Y the range of f. A function is
also called a transformation, map, mapping, or operator.

What we have given is not really a definition in the strict sense, because
we used the word “rule” in the so-called definition; and what is a
rule? So, we try to further clarify the concept of function by discussing
graphs of functions. The graph of f is a subset of the product set $D \times Y$,
namely the subset

$$G = \{(x, f(x));\ x \in D\}.$$

The Cartesian product $D \times Y$ (usually) has very many subsets, very few of
which are graphs of functions from D into Y. If G is a subset of $D \times Y$,
when is G the graph of a function from D into Y? Answer: If and only if,
for each x in D, there is precisely one y in Y such that $(x, y) \in G$. Some
people prefer to give this subset characterization as the definition of a
function.
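
When D and Y are finite, the subset characterization can be checked mechanically. The following Python sketch does so under that assumption. The function name is_graph_of_function and the sample sets are hypothetical, introduced only for illustration.

```python
from itertools import chain, combinations, product


def is_graph_of_function(G, D, Y):
    """True if G is a subset of D x Y in which each x in D is the first
    coordinate of precisely one pair."""
    if not all(x in D and y in Y for (x, y) in G):
        return False
    return all(sum(1 for (a, _) in G if a == x) == 1 for x in D)


D = {1, 2, 3}
Y = {"a", "b"}
cartesian = set(product(D, Y))                          # the 6 pairs of D x Y

graph = {(1, "a"), (2, "a"), (3, "b")}                  # one y for each x
not_graph = {(1, "a"), (1, "b"), (2, "a"), (3, "b")}    # 1 is paired twice
partial = {(1, "a"), (2, "b")}                          # 3 is paired with nothing

# Count how many of the 2**6 subsets of D x Y are graphs of functions.
pairs = sorted(cartesian)
all_subsets = chain.from_iterable(
    combinations(pairs, r) for r in range(len(pairs) + 1)
)
n_graphs = sum(is_graph_of_function(set(G), D, Y) for G in all_subsets)
```

In this small case only 8 of the 64 subsets are graphs, one for each function from D into Y.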
Let f be a function

$$D \xrightarrow{f} Y.$$

If T is a subset of Y, the inverse image of T under f is

$$f^{-1}(T) = \{x \in D;\ f(x) \in T\}.$$

A level set of f is the inverse image of a point $y \in Y$:

$$\{x;\ f(x) = y\}.$$

(Actually, we only call this a level set if it is non-empty.) A function is
constant if it has precisely one level set.

If A is a subset of D, the set of values which are assumed by f on A is
called the image of A under f:

$$f(A) = \{y \in Y;\ y = f(x) \text{ for some } x \in A\}.$$

The image set f(D), i.e., the set of all values assumed by f, is called the
image of f. If the image of f is (all of) Y, we say that f is a function from D
onto Y (or sometimes, “f is onto”).
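
These notions can be computed directly for a function on a finite domain. The sketch below is in Python and assumes D and Y are finite sets and f is an ordinary Python callable. All names in it (inverse_image, image, level_sets, is_onto, is_constant, square) are ours and serve only as illustration.

```python
def inverse_image(f, D, T):
    """The set of x in D with f(x) in T."""
    return {x for x in D if f(x) in T}


def image(f, A):
    """The set of values assumed by f on A."""
    return {f(x) for x in A}


def level_sets(f, D):
    """The non-empty level sets of f, keyed by the value y."""
    levels = {}
    for x in D:
        levels.setdefault(f(x), set()).add(x)
    return levels


def is_onto(f, D, Y):
    """True if the image of f is all of Y."""
    return image(f, D) == set(Y)


def is_constant(f, D):
    """True if f has precisely one level set."""
    return len(level_sets(f, D)) == 1


D = {-2, -1, 0, 1, 2}
Y = {0, 1, 4, 9}


def square(x):
    return x * x
```

Here square maps D into Y but not onto Y, since 9 is not assumed; it is onto the smaller set {0, 1, 4}.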
The function f is called 1:1 (read “one-to-one”) if

$$f(x_1) \neq f(x_2), \qquad x_1 \neq x_2.$$

In other words, f is 1:1 provided $f(x_1) = f(x_2)$ implies $x_1 = x_2$. Suppose
that f is 1:1 and onto. There is then defined an inverse function

$$D \xleftarrow{f^{-1}} Y$$

as follows: $f^{-1}(y)$ is the (unique) point $x \in D$ such that y = f(x). A 1:1
function from D onto Y is also called a 1:1 correspondence between the
members of D and the members of Y.


If

$$D \xrightarrow{f} Y \xrightarrow{g} Z$$

there is defined the composition

$$D \xrightarrow{g \circ f} Z$$

by

$$(g \circ f)(x) = g(f(x)).$$

For example, if f is 1:1 and onto we have

$$D \xrightarrow{f} Y \xrightarrow{f^{-1}} D$$

and the composition $f^{-1} \circ f$ is the identity function I on D:

$$I(x) = x.$$

Similarly, $f \circ f^{-1}$ is the identity function on Y.
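
For a function on a finite domain, the 1:1 property, the inverse function, and composition can all be written out explicitly. The following is a sketch in Python under that finiteness assumption. The helper names and the sample function f(x) = 2x + 1 are hypothetical and chosen only for illustration.

```python
def is_one_to_one(f, D):
    """True if distinct points of D have distinct values under f."""
    values = [f(x) for x in D]
    return len(values) == len(set(values))


def inverse_function(f, D):
    """Table of the inverse function on the image f(D); requires f to be 1:1."""
    if not is_one_to_one(f, D):
        raise ValueError("f is not 1:1, so no inverse function is defined")
    return {f(x): x for x in D}


def compose(g, f):
    """The composition x -> g(f(x))."""
    return lambda x: g(f(x))


D = {0, 1, 2, 3}


def f(x):
    return 2 * x + 1           # 1:1 from D onto Y = {1, 3, 5, 7}


Y = {f(x) for x in D}
f_inv_table = inverse_function(f, D)


def f_inv(y):
    return f_inv_table[y]
```

Composing f_inv with f returns each point of D to itself, and composing f with f_inv does the same on Y, which is the statement about identity functions made above.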
If Y is a set, a sequence (of elements) in Y is a function from the set of
positive integers

$$Z_+ = \{1, 2, 3, \ldots\}$$

into the set Y. Usually a sequence in Y will be indicated as $\{y_n\}$, meaning
the function

$$Z_+ \xrightarrow{f} Y$$

for which $f(n) = y_n$.
We shall sometimes deal with sequences of sets. If $\{S_n\}$ is a sequence
of sets, we have the union and intersection

$$\bigcup_n S_n = \{x;\ x \in S_n \text{ for some } n\}$$
$$\bigcap_n S_n = \{x;\ x \in S_n \text{ for every } n\}.$$

We will also deal with families of sets which are indexed by sets other than
the positive integers. If we talk about the family of sets $\{S_\alpha;\ \alpha \in A\}$, that
means that associated with each $\alpha \in A$ we have a set $S_\alpha$. The notations

$$\bigcup_\alpha S_\alpha, \qquad \bigcap_\alpha S_\alpha$$

should have meanings which are clear.
At one point, we shall make use of the axiom of choice which says the
following. If $\{S_\alpha;\ \alpha \in A\}$ is a collection of non-empty non-overlapping sets

$$S_\alpha \neq \emptyset$$
$$S_\alpha \cap S_\beta = \emptyset, \qquad \alpha \neq \beta$$

then there exists a set which consists of exactly one element from each $S_\alpha$.
(Loosely, there is a way to choose one element from each set.) This is called
an axiom (assumption), for the following reason. What is asserted is that,
given $\{S_\alpha;\ \alpha \in A\}$, there is a rule (function) which selects an element
$x_\alpha \in S_\alpha$. But, what exactly is the rule? Since we can’t say, we just assume that
there is one.

A.2 Cardinality

The question of how many members a set has becomes much more
complicated for infinite sets than it is for finite sets. One tends to think that
the set of positive integers {1, 2, 3, ...} has more members in it than does
the set of even positive integers {2, 4, 6, ...}. On the other hand, if we list
one set under the other thusly:

    1   2   3   4   ...
    2   4   6   8   ...

it is difficult to convince anyone that the first set has more members than
the second. The lesson to be learned from this is the following: If we deal
with infinite sets and wish to discuss whether or not one set has more members
than another, at least one of our intuitive ideas about “more than”
will have to fall by the wayside. It turns out that the thing to scrap is the
intuitive idea which suggests that {1, 2, 3, ...} has more members than
does {2, 4, 6, ...}, namely, the idea that, if S is a proper subset of T, then
T has more members. In particular, face the fact that there are just as many
even integers as there are integers.
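
The two-row listing amounts to the pairing of n with 2n. A small Python sketch of that pairing follows. The names double and halve are ours, and the check naturally covers only finitely many integers, so it illustrates the correspondence rather than proving it.

```python
from itertools import count, islice


def double(n):
    """Send the positive integer n to the even positive integer 2n."""
    return 2 * n


def halve(m):
    """Send the even positive integer m back to m/2."""
    return m // 2


# The first few columns of the listing, as pairs (n, 2n).
first_columns = [(n, double(n)) for n in islice(count(1), 4)]
```

Since halve undoes double and double undoes halve (on even integers), the pairing is 1:1 from the positive integers onto the even positive integers.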

We are not going to give anything resembling a thorough discussion 
of the circle of ideas involved with the number of elements in a set. But we 
need to know just a bit about the question of when two sets have the same 
number of elements. We say that S and T have the same cardinality if there 
exists a 1:1 correspondence between the members of S and the members
of T. Thus S and T have the same cardinality if there exists a function

$$S \xrightarrow{f} T$$

which is 1:1 with f(S) = T.

The most basic fact about cardinality is this. Suppose that S and T 
are sets such that there exists a 1:1 function from S into T, and also
there exists a 1:1 function from T into S; then S and T have the same
cardinality. Intuitively, if T has at least as many elements as does S and
vice versa, then they have the same number of elements. We shall not stop 
to prove this fact. (It’s a non-trivial mental exercise.) We shall content 
ourselves with some special cases. 

A set S is finite if (it is empty or) there exists a positive integer n such 
that S has the same cardinality as {1, 2, 3, ..., n}. A set is infinite if it is
not finite. Every infinite set exhibits the phenomenon we saw with the
integers and even integers, that is, every infinite set can be put in 1:1
correspondence with a proper subset of itself. The set S is countable if it is
finite or has the same cardinality as the set of positive integers. The term 
countable stems from the following. 


Lemma. The non-empty set S is countable if and only if there exists a
function from the set of positive integers onto S, i.e., if and only if there is a
sequence $\{x_n\}$ such that

(i) for each n, $x_n \in S$;
(ii) if $x \in S$, then $x = x_n$ for some n.


Proof. If S is countable but not finite, we have a sequence $\{x_n\}$ such
that, if $x \in S$, then $x = x_n$ for precisely one n. If S is finite then
$S = \{x_1, x_2, \ldots, x_n\}$. Define $x_k = x_n$ for k > n and $\{x_k\}$ is the sequence we want.


So, a non-empty countable set is one such that there is a listing
$x_1, x_2, x_3, \ldots$ of its elements; i.e., a non-empty countable set is the image of a
sequence.
Theorem A.1. Every subset of a countable set is countable.


Proof. Suppose S is countable and T is a subset of S. Suppose T is
non-empty. Then S is non-empty, and we have a sequence $\{x_n\}$ with S as its
image:

$$S = \{x_n;\ n \in Z_+\}.$$

Let $n_1$ be the smallest positive integer k such that $x_k \in T$. Let $n_2$ be the
smallest positive integer k such that $k > n_1$ and $x_k \in T$. Continue by
induction, to obtain the sequence of positive integers $n_1 < n_2 < n_3 < \cdots$
such that $n_j$ is the least positive integer k such that $k > n_{j-1}$ and $x_k \in T$.
(If at some stage no such k exists, then T is the finite set
$\{x_{n_1}, \ldots, x_{n_{j-1}}\}$ and there is nothing to prove.) Let

$$t_j = x_{n_j}$$

and it is obvious that T is the image of the sequence $\{t_j\}$.
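
The selection of the indices in the proof is a simple search, and it can be written as a generator. The Python sketch below assumes the sequence is given as a callable x with x(n) standing for the nth term, and that membership in T is given as a predicate in_T. Both conventions, and the sample data, are ours.

```python
from itertools import count, islice
from math import isqrt


def enumerate_subset(x, in_T):
    """Yield t_1, t_2, ... where t_j = x(n_j) and n_1 < n_2 < ... are, in
    increasing order, the indices k with x(k) in T.

    If only finitely many such indices exist, the search for the next one
    never ends; that is the case in which T is finite.
    """
    for k in count(1):
        if in_T(x(k)):
            yield x(k)


def x(n):
    return n                   # S = positive integers, listed as x_n = n


def is_perfect_square(v):
    return isqrt(v) ** 2 == v  # T = perfect squares, a subset of S


first_terms = list(islice(enumerate_subset(x, is_perfect_square), 5))
```

Because the generator visits the indices in increasing order, the jth value it yields is exactly the term whose index is the least one exceeding the previous choice.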


Corollary. The set S is countable if and only if S has the same cardi- 
nality as some subset of the positive integers. 


Theorem A.2. Suppose that $\{S_n\}$ is a sequence of countable sets. Then
the union

$$\bigcup_n S_n$$

is a countable set.


Proof. Obviously we may throw out any $S_n$’s which are empty. Thus
we have sequences which list the members of the sets $S_n$:

$$S_n = \{x_{nk};\ k \in Z_+\}.$$


We then list the elements of the union by the following scheme: 
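
One standard scheme of this kind, which we assume here for illustration, arranges the elements $x_{nk}$ in an infinite array with n indexing rows and k indexing columns, and lists the array along its anti-diagonals n + k = 2, 3, 4, .... A Python sketch of that listing follows. The function name diagonal_union and the sample family of sets are hypothetical, and elements are assumed hashable so that repeats can be skipped.

```python
from itertools import count, islice


def diagonal_union(x):
    """Yield every x(n, k), n, k >= 1, walking the anti-diagonals
    n + k = 2, 3, 4, ... and skipping values already listed."""
    seen = set()
    for total in count(2):
        for n in range(1, total):
            value = x(n, total - n)
            if value not in seen:
                seen.add(value)
                yield value


# Hypothetical example: S_n is the set of multiples of n, listed as x_nk = n * k.
def multiples(n, k):
    return n * k


first_listed = list(islice(diagonal_union(multiples), 8))
```

Each pair (n, k) lies on the anti-diagonal n + k, which is reached after finitely many steps, so every element of the union eventually appears in the list.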


Corollary. The set of all rational numbers is countable. 


Proof. A rational number has the form m/n, where m is an integer and
n is a positive integer. For each n, let $S_n$ be the set of all numbers m/n,
where m is an integer. The set of integers is countable by Theorem A.2,
or by the scheme

    1   2   3   4   5   ...
    0   1  -1   2  -2   ...

So each $S_n$ is countable, and the union of the sets $S_n$ is countable.


The reader should be very familiar (henceforth) with the essence of 
the last corollary, which is the fact that the set of positive rational num- 
bers is countable via the scheme: 
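
A common form of such a scheme, assumed here for illustration, places the fraction p/q in row p and column q of an infinite array and lists the array along the anti-diagonals p + q = 2, 3, 4, ..., keeping only fractions in lowest terms so that no rational number is listed twice. The Python sketch below uses the standard library's Fraction type; the function name positive_rationals is ours.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def positive_rationals():
    """Yield each positive rational number exactly once, walking the array
    of fractions p/q along the anti-diagonals p + q = 2, 3, 4, ..."""
    for total in count(2):
        for p in range(1, total):
            q = total - p
            if gcd(p, q) == 1:         # skip repeats such as 2/2 or 2/4
                yield Fraction(p, q)


first_nine = list(islice(positive_rationals(), 9))
```

The listing begins 1, 1/2, 2, 1/3, 3, 1/4, 2/3, 3/2, 4, and any given p/q in lowest terms appears once the anti-diagonal p + q is reached.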


Theorem A.3. The set of real numbers (points on a line) is uncountable 
(not countable). 


Proof. It will suffice to show that the set of real numbers x, $0 < x < 1$,
is not countable. Each x in that interval has a decimal representation

$$x = .a_1 a_2 a_3 \cdots, \qquad 0 \le a_n \le 9.$$

The only ambiguity in the representation is caused by an infinite string of
repeating 9’s. The decimal

$$.a_1 a_2 \cdots a_n 999 \cdots$$

represents the same number as does

$$.a_1 a_2 \cdots (a_n + 1)000 \cdots.$$

That is a minor technicality.

Suppose that we have a sequence of points $x_n$ in the interval $0 < x < 1$.
We shall show that the image of the sequence $\{x_n\}$ cannot exhaust the
interval. (Hence, the interval is not a countable set.) Given the sequence
$\{x_n\}$, look at the decimal representations:

$$x_1 = .a_{11} a_{12} a_{13} \cdots$$
$$x_2 = .a_{21} a_{22} a_{23} \cdots$$
$$x_3 = .a_{31} a_{32} a_{33} \cdots, \text{ etc.}$$

We shall exhibit a number x, $0 < x < 1$, which is not in the list $x_1, x_2, x_3, \ldots$.
We describe x by its decimal representation. For each n, choose an
integer $a_n$, $0 \le a_n \le 9$, such that $a_n \neq a_{nn}$. Let

$$x = .a_1 a_2 a_3 \cdots.$$

Then $x \neq x_n$, because the nth digit of x is different from the nth digit of $x_n$.
Of course, there is a little confusion caused by the fact that different digits
do not imply different numbers (repeating 9’s). Therefore, we should
choose the digits $a_n$ so that $0 < a_n < 9$ and $|a_n - a_{nn}| > 2$. Then $x \neq x_n$
for every n, for certain. We have produced an x, $0 < x < 1$, which is not
in the list $x_1, x_2, x_3, \ldots$. Conclusion: No such list can exhaust the points
in that interval.
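
The choice of digits in the proof is entirely explicit, and it can be sketched in Python. We assume the list is presented by a callable digit(n, k) giving the kth decimal digit of the nth number, both indices starting at 1. That interface, the particular rule used to pick each digit, and the sample table of digits are our own choices for illustration.

```python
def diagonal_digits(digit, n_terms):
    """Return the first n_terms digits a_1, a_2, ... of the number x.

    digit(n, k) is assumed to give the k-th decimal digit of x_n (n, k >= 1).
    Each a_n satisfies 0 < a_n < 9 and |a_n - a_nn| > 2.
    """
    digits = []
    for n in range(1, n_terms + 1):
        a_nn = digit(n, n)
        # One admissible choice among several: 7 is far from the digits
        # 0..4, and 2 is far from the digits 5..9.
        a_n = 7 if a_nn <= 4 else 2
        digits.append(a_n)
    return digits


# Hypothetical table of digits, used only to exercise the construction.
def sample_digit(n, k):
    return (n + k) % 10


first_digits = diagonal_digits(sample_digit, 5)
```

Because the digits produced are never 0 or 9, the resulting decimal has no ambiguity of the repeating-9 kind and lies strictly between 0 and 1.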


Thus, although the set of rational numbers has the same cardinality 
as the set of positive integers, the set of real numbers does not (i.e., it has 
a larger cardinality). One may reasonably ask: Does there exist a subset S 
of the set of real numbers which is uncountable yet does not have the same 
cardinality as the set of all real numbers? That turns out to be a very 
sophisticated question which plagued mathematical logicians for many 
years. It was settled completely only a few years ago. The answer in im- 
precise form is: You will not prove that such a set exists nor will you prove 
that no such set exists, unless mathematics as we know it is logically
inconsistent. 

• Employs spaces and sets of matrices as vehicles for providing richer
examples relating to open sets, closed sets, continuity, and other topics.

• Approaches basic concepts such as convergence, continuity, and
compactness first in terms of Euclidean space and later in the context of
(subsets of) normed linear spaces.
Analysis in Euclidean Space introduces the basic concepts, principles, and
methods associated with mathematical analysis in a significant departure from
conventional books on the same subject. This difference is manifested in
two ways. First, the introductions to real and complex analysis are treated
together in a single volume, thereby facilitating rapid transition into complex
function theory. Second, generalizations beyond $R^n$ are presented for subsets
of normed linear spaces, rather than for metric spaces, preparing a
familiar foundation in Euclidean space that can be extended quite easily to
normed linear spaces.


The eight detailed chapters of Analysis in Euclidean Space are grouped into 
two major areas of coverage. The first five chapters, for example, concern 
the important “four C’s” of completeness, convergence, compactness, and 
continuity. More specifically, they present a review of numbers and geometry 
and cover convergent sequences and infinite series, open, closed and compact 
sets, the limit of a function, continuous mappings, uniform continuity, selected 
topics in calculus, Fourier series, power series, and analytic functions. 


The remaining chapters employ the results from earlier discussions, emphasizing
linear spaces and norms; norms on $R^n$; convergence, continuity,
completeness, and compactness in normed linear spaces; quotient spaces; the
Lebesgue integral, including Fubini’s theorem and orthogonal expansions; 
and differentiable mappings. 